include/linux/bpf-cgroup.h | 9 +- include/linux/io_uring.h | 1 + include/net/sock.h | 4 + include/uapi/linux/io_uring.h | 8 + io_uring/uring_cmd.c | 55 ++++ kernel/bpf/cgroup.c | 25 +- net/core/sock.c | 8 - net/socket.c | 102 ++++--- tools/include/io_uring/mini_liburing.h | 282 ++++++++++++++++++ .../selftests/bpf/prog_tests/sockopt.c | 113 ++++++- tools/testing/selftests/net/Makefile | 1 + .../selftests/net/io_uring_zerocopy_tx.c | 268 +---------------- 12 files changed, 544 insertions(+), 332 deletions(-) create mode 100644 tools/include/io_uring/mini_liburing.h
This patchset adds support for getsockopt (SOCKET_URING_OP_GETSOCKOPT)
and setsockopt (SOCKET_URING_OP_SETSOCKOPT) in io_uring commands.
SOCKET_URING_OP_SETSOCKOPT implements generic case, covering all levels
and optnames. SOCKET_URING_OP_GETSOCKOPT is limited, for now, to
SOL_SOCKET level, which seems to be the most common level parameter for
get/setsockopt(2).
In order to keep the implementation (and tests) simple, some refactors
were done prior to the changes, as follows:
Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
become flexible enough to accept user or kernel pointers for optval/optlen.
Patch 3-4: Remove the core {s,g}etsockopt() core function from
__sys_{g,s}etsockopt, so, the code could be reused by other callers,
such as io_uring.
Patch 5: Pass compat mode to the file/socket callbacks
Patch 6: Move io_uring helpers from io_uring_zerocopy_tx to a generic
io_uring headers. This simplify the test case (last patch)
Patch 7: Protect io_uring_cmd_sock() to not be called if CONFIG_NET is
disabled.
PS1: For getsockopt command, the optlen field is not a userspace
pointers, but an absolute value, so this is slightly different from
getsockopt(2) behaviour. The new optlen value is returned in cqe->res.
PS2: The userspace pointers need to be alive until the operation is
completed.
These changes were tested with a new test[1] in liburing, LTP sockopt*
tests, as also with bpf/progs/sockopt test case, which is now adapted to
run using both system calls and io_uring commands.
[1] Link: https://github.com/leitao/liburing/blob/getsockopt/test/socket-getsetsock-cmd.c
RFC -> V1:
* Copy user memory at io_uring subsystem, and call proto_ops
callbacks using kernel memory
* Implement all the cases for SOCKET_URING_OP_SETSOCKOPT
V1 -> V2
* Implemented the BPF part
* Using user pointers from optval to avoid kmalloc in io_uring part.
V2 -> V3:
* Break down __sys_setsockopt and reuse the core code, avoiding
duplicated code. This removed the requirement to expose
sock_use_custom_sol_socket().
* Added io_uring test to selftests/bpf/sockopt.
* Fixed compat argument, by passing it to the issue_flags.
V3 -> V4:
* Rebase on top of commit 1ded5e5a5931b ("net: annotate data-races around sock->ops")
* Also broke down __sys_setsockopt() to reuse the core function
from io_uring.
* Create a new patch to return -EOPNOTSUPP if CONFIG_NET is
disabled
* Added two SOL_SOCKET tests in bpf/prog_tests/sockopt.c
Breno Leitao (10):
bpf: Leverage sockptr_t in BPF getsockopt hook
bpf: Leverage sockptr_t in BPF setsockopt hook
net/socket: Break down __sys_setsockopt
net/socket: Break down __sys_getsockopt
io_uring/cmd: Pass compat mode in issue_flags
selftests/net: Extract uring helpers to be reusable
io_uring/cmd: return -EOPNOTSUPP if net is disabled
io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT
selftests/bpf/sockopt: Add io_uring support
include/linux/bpf-cgroup.h | 9 +-
include/linux/io_uring.h | 1 +
include/net/sock.h | 4 +
include/uapi/linux/io_uring.h | 8 +
io_uring/uring_cmd.c | 55 ++++
kernel/bpf/cgroup.c | 25 +-
net/core/sock.c | 8 -
net/socket.c | 102 ++++---
tools/include/io_uring/mini_liburing.h | 282 ++++++++++++++++++
.../selftests/bpf/prog_tests/sockopt.c | 113 ++++++-
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/io_uring_zerocopy_tx.c | 268 +----------------
12 files changed, 544 insertions(+), 332 deletions(-)
create mode 100644 tools/include/io_uring/mini_liburing.h
--
2.34.1
On Mon, 4 Sep 2023 09:24:53 -0700 Breno Leitao wrote: > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions > become flexible enough to accept user or kernel pointers for optval/optlen. Have you seen: https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/ ? I wasn't aware that Linus felt this way, now I wonder if having sockptr_t spread will raise any red flags as this code flows back to him.
Hello Jakub,
On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote:
> On Mon, 4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> > become flexible enough to accept user or kernel pointers for optval/optlen.
>
> Have you seen:
>
> https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/
>
> ? I wasn't aware that Linus felt this way, now I wonder if having
> sockptr_t spread will raise any red flags as this code flows back
> to him.
Thanks for the heads-up. I've been thinking about it for a while and I'd
like to hear what are the next steps here.
Let me first back up and state where we are, and what is the current
situation:
1) __sys_getsockopt() uses __user pointers for both optval and optlen
2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
sqe, which is a kernel pointer/value.
Thus, we need to make the common code (callbacks) able to handle __user
and kernel pointers (for optlen, at least).
From a proto_ops callback perspective, ->setsockopt() uses sockptr.
int (*setsockopt)(struct socket *sock, int level,
int optname, sockptr_t optval,
unsigned int optlen);
Getsockopt() uses sockptr() for level=SOL_SOCKET:
int sk_getsockopt(struct sock *sk, int level, int optname,
sockptr_t optval, sockptr_t optlen)
But not for the other levels:
int (*getsockopt)(struct socket *sock, int level,
int optname, char __user *optval, int __user *optlen);
That said, if this patchset shouldn't use sockptr anymore, what is the
recommendation?
If we move this patchset to use iov_iter instead of sockptr, then I
understand we want to move *all* these callbacks to use iov_vec. Is this
the right direction?
Thanks for the guidance!
[1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/
On Fri, Oct 6, 2023 at 10:45 AM Breno Leitao <leitao@debian.org> wrote:
>
> Hello Jakub,
>
> On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote:
> > On Mon, 4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> > > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> > > become flexible enough to accept user or kernel pointers for optval/optlen.
> >
> > Have you seen:
> >
> > https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/
> >
> > ? I wasn't aware that Linus felt this way, now I wonder if having
> > sockptr_t spread will raise any red flags as this code flows back
> > to him.
>
> Thanks for the heads-up. I've been thinking about it for a while and I'd
> like to hear what are the next steps here.
>
> Let me first back up and state where we are, and what is the current
> situation:
>
> 1) __sys_getsockopt() uses __user pointers for both optval and optlen
> 2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
> sqe, which is a kernel pointer/value.
>
> Thus, we need to make the common code (callbacks) able to handle __user
> and kernel pointers (for optlen, at least).
>
> From a proto_ops callback perspective, ->setsockopt() uses sockptr.
>
> int (*setsockopt)(struct socket *sock, int level,
> int optname, sockptr_t optval,
> unsigned int optlen);
>
> Getsockopt() uses sockptr() for level=SOL_SOCKET:
>
> int sk_getsockopt(struct sock *sk, int level, int optname,
> sockptr_t optval, sockptr_t optlen)
>
> But not for the other levels:
>
> int (*getsockopt)(struct socket *sock, int level,
> int optname, char __user *optval, int __user *optlen);
>
>
> That said, if this patchset shouldn't use sockptr anymore, what is the
> recommendation?
>
> If we move this patchset to use iov_iter instead of sockptr, then I
> understand we want to move *all* these callbacks to use iov_vec. Is this
> the right direction?
>
> Thanks for the guidance!
>
> [1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/
Since sockptr_t is already used by __sys_setsockopt and
__sys_setsockopt, patches 1 and 2 don't introduce any new sockptr code
paths.
setsockopt callbacks also already use sockptr as of commit
a7b75c5a8c41 ("net: pass a sockptr_t into ->setsockopt").
getsockopt callbacks do take user pointers, just not sockptr.
Is the only issue right now the optlen kernel pointer?
On Mon, Oct 09, 2023 at 03:11:05AM -0700, Willem de Bruijn wrote:
> On Fri, Oct 6, 2023 at 10:45 AM Breno Leitao <leitao@debian.org> wrote:
> > Let me first back up and state where we are, and what is the current
> > situation:
> >
> > 1) __sys_getsockopt() uses __user pointers for both optval and optlen
> > 2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
> > sqe, which is a kernel pointer/value.
> >
> > Thus, we need to make the common code (callbacks) able to handle __user
> > and kernel pointers (for optlen, at least).
> >
> > From a proto_ops callback perspective, ->setsockopt() uses sockptr.
> >
> > int (*setsockopt)(struct socket *sock, int level,
> > int optname, sockptr_t optval,
> > unsigned int optlen);
> >
> > Getsockopt() uses sockptr() for level=SOL_SOCKET:
> >
> > int sk_getsockopt(struct sock *sk, int level, int optname,
> > sockptr_t optval, sockptr_t optlen)
> >
> > But not for the other levels:
> >
> > int (*getsockopt)(struct socket *sock, int level,
> > int optname, char __user *optval, int __user *optlen);
> >
> >
> > That said, if this patchset shouldn't use sockptr anymore, what is the
> > recommendation?
> >
> > If we move this patchset to use iov_iter instead of sockptr, then I
> > understand we want to move *all* these callbacks to use iov_vec. Is this
> > the right direction?
> >
> > Thanks for the guidance!
> >
> > [1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/
>
> Since sockptr_t is already used by __sys_setsockopt and
> __sys_setsockopt, patches 1 and 2 don't introduce any new sockptr code
> paths.
>
> setsockopt callbacks also already use sockptr as of commit
> a7b75c5a8c41 ("net: pass a sockptr_t into ->setsockopt").
>
> getsockopt callbacks do take user pointers, just not sockptr.
>
> Is the only issue right now the optlen kernel pointer?
Correct. The current discussion is only related to optlen in the
getsockopt() callbacks (invoked when level != SOL_SOCKET). Everything
else (getsockopt(level=SOL_SOCKET..) and setsockopt) is using sockptr.
Is it bad if we review/merge this code as is (using sockptr), and start
the iov_iter/getsockopt() refactor in a follow-up thread?
Thanks!
On Mon, 9 Oct 2023 06:28:00 -0700 Breno Leitao wrote: > Correct. The current discussion is only related to optlen in the > getsockopt() callbacks (invoked when level != SOL_SOCKET). Everything > else (getsockopt(level=SOL_SOCKET..) and setsockopt) is using sockptr. > > Is it bad if we review/merge this code as is (using sockptr), and start > the iov_iter/getsockopt() refactor in a follow-up thread? Sorry for the delay, I only looked at the code now :S Agreed, that there's no need to worry about the sockptr spread in this series. It looks good to go in.
On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote: > On Mon, 4 Sep 2023 09:24:53 -0700 Breno Leitao wrote: > > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions > > become flexible enough to accept user or kernel pointers for optval/optlen. > > Have you seen: > > https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/ I haven't but I think it will not affect *much* this patchset. > ? I wasn't aware that Linus felt this way, now I wonder if having > sockptr_t spread will raise any red flags as this code flows back > to him. I can change the io_uring API in a way that we can avoid these sockptr_t changes completely. My plan is to mimic what getsockopt(2) is doing in io_uring cmd path, in regard to optlen being an userpointer, instead of a value - which is then translated to a KERNEL_SOCKPTR. In this way, this change don't need to touch any sockptr field. Thanks for the heads-up
© 2016 - 2026 Red Hat, Inc.