This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
completion notifications through the standard inet ABI. IP_RECVERR /
IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
and future subflows.

Patch 1 factors per-flag inet_assign_bit() calls in
sync_socket_options() into a mask-driven loop so future propagated
flags only need to extend MPTCP_INET_FLAGS_MASK.

Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
apply it on the parent, and forward to every subflow under lock_sock()
so concurrent setsockopt callers cannot leave parent and subflows
desynchronized. Newly-joining subflows pick up the four RECVERR bits
through sync_socket_options().

Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
from each subflow's error queue onto the parent's, so pollers see
EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
ICMP errors are dropped; they will be carried by a future
MPTCP_RECERR channel.

Patch 4 covers IP_RECVERR / IPV6_RECVERR propagation and the
empty-errqueue EAGAIN contract on MSG_ERRQUEUE | MSG_DONTWAIT in
selftest.

v6 -> v7:
  - patch 2: gate SOL_IPV6 setsockopt/getsockopt dispatch on
    sk_family == AF_INET6, returning -ENOPROTOOPT otherwise, mirroring
    plain TCP. Addresses the sashiko Medium finding on v6 where
    IPV6_RECVERR silently succeeded on AF_INET MPTCP sockets.
  - patch 3: track moved skbs in mptcp_recv_error() and retry
    inet_recv_error() when ret == -EAGAIN && moved, so a successful
    subflow splice is not masked by the initial drain returning EAGAIN
    (sashiko High #2 on v6).
  - patch 3: add mptcp_subflow_errqueue_pending() and OR it into the
    EPOLLERR check in mptcp_poll(), so events stranded on a subflow
    when the parent is under rmem pressure still wake userspace
    (sashiko High #1 on v6).
  - rebased on current export.

Tested with KVM-validation auto-normal: 25/25 pass.
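For readers who want to see the empty-errqueue contract from userspace, here is a minimal sketch. It is not taken from the selftest in patch 4: check_empty_errqueue() is a hypothetical helper, and the fallback definition of IPPROTO_MPTCP is only there for older libc headers. It enables IP_RECVERR, reads the option back, and expects recvmsg(MSG_ERRQUEUE | MSG_DONTWAIT) on an idle socket to fail with EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* fallback for libc headers that predate MPTCP */
#endif

/*
 * Hypothetical helper, not part of the series.
 * Returns 0 when IP_RECVERR reads back as 1 and an empty error queue
 * yields EAGAIN; returns a negative step code otherwise.
 */
static int check_empty_errqueue(int proto)
{
	char buf[64], ctrl[256];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg;
	socklen_t len = sizeof(int);
	int on = 1, val = 0, ret;
	ssize_t n;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, proto);
	if (fd < 0)
		return -1;	/* protocol not available on this kernel */

	ret = -2;
	if (setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on)) < 0)
		goto out;

	ret = -3;
	if (getsockopt(fd, IPPROTO_IP, IP_RECVERR, &val, &len) < 0 || val != 1)
		goto out;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl;
	msg.msg_controllen = sizeof(ctrl);

	/* Nothing was sent, so the error queue must be empty. */
	n = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
	ret = (n == -1 && errno == EAGAIN) ? 0 : -4;
out:
	close(fd);
	return ret;
}
```

Calling check_empty_errqueue(IPPROTO_TCP) exercises the plain TCP behaviour the series aims to mirror; with IPPROTO_MPTCP the setsockopt step is expected to succeed only on a kernel carrying this series.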
David Carlier (4):
  mptcp: sockopt: factor inet_flags propagation into a mask
  mptcp: propagate RECVERR sockopts to subflows
  mptcp: support MSG_ERRQUEUE on the parent socket
  selftests: mptcp: cover IP_RECVERR sockopt propagation

 net/mptcp/protocol.c                          |  92 ++++++++++-
 net/mptcp/sockopt.c                           | 146 ++++++++++++++----
 .../selftests/net/mptcp/mptcp_sockopt.c       |  55 +++++++
 3 files changed, 261 insertions(+), 32 deletions(-)

base-commit: 63b133728231ebba5167bd1e53dda9bcf0bee7c7
-- 
2.53.0
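The mask-driven propagation described for patch 1 can be illustrated with a small userspace model. This is only a sketch: the DEMO_* bits, DEMO_FLAGS_MASK and demo_sync_flags() are hypothetical stand-ins, not the kernel's inet flag bits, MPTCP_INET_FLAGS_MASK or sync_socket_options(). The point it shows is that a new propagated flag only needs to be added to the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the kernel's inet flag bits. */
enum {
	DEMO_RECVERR          = 1u << 0,
	DEMO_RECVERR_RFC4884  = 1u << 1,
	DEMO_RECVERR6         = 1u << 2,
	DEMO_RECVERR6_RFC4884 = 1u << 3,
	DEMO_UNRELATED        = 1u << 4,	/* not propagated */
};

/* Extending propagation means adding a bit here and nowhere else. */
#define DEMO_FLAGS_MASK (DEMO_RECVERR | DEMO_RECVERR_RFC4884 | \
			 DEMO_RECVERR6 | DEMO_RECVERR6_RFC4884)

/* Copy every masked bit from the parent; leave other subflow bits alone. */
static uint32_t demo_sync_flags(uint32_t subflow, uint32_t parent)
{
	uint32_t mask = DEMO_FLAGS_MASK;

	assert((DEMO_FLAGS_MASK & DEMO_UNRELATED) == 0);

	while (mask) {
		uint32_t bit = mask & (~mask + 1u);	/* lowest set bit */

		if (parent & bit)
			subflow |= bit;
		else
			subflow &= ~bit;
		mask &= ~bit;
	}
	return subflow;
}
```

For instance, a subflow holding DEMO_UNRELATED | DEMO_RECVERR6 synced against a parent holding only DEMO_RECVERR ends up with DEMO_UNRELATED | DEMO_RECVERR: the masked bits follow the parent and the unrelated bit is preserved.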
Hi David,

On 10/05/2026 07:16, David Carlier wrote:
> This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
> poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
> completion notifications through the standard inet ABI. IP_RECVERR /
> IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
> and future subflows.
>
> Patch 1 factors per-flag inet_assign_bit() calls in
> sync_socket_options() into a mask-driven loop so future propagated
> flags only need to extend MPTCP_INET_FLAGS_MASK.
>
> Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
> apply it on the parent, and forward to every subflow under lock_sock()
> so concurrent setsockopt callers cannot leave parent and subflows
> desynchronized. Newly-joining subflows pick up the four RECVERR bits
> through sync_socket_options().
>
> Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
> from each subflow's error queue onto the parent's, so pollers see
> EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
> ICMP errors are dropped; they will be carried by a future
> MPTCP_RECERR channel.

Sorry for the delay: I saw Sashiko had some comments [1], and because I
noticed you checked it before, I thought you were going to send a reply
or a new version, and I forgot to ask here. So here it is: is the review
correct?

[1] https://sashiko.dev/#/patchset/20260509211651.104934-1-devnexen@gmail.com

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Hi,

On Wed, 27 May 2026 at 06:08, Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi David,
>
> On 10/05/2026 07:16, David Carlier wrote:
> > This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
> > poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
> > completion notifications through the standard inet ABI. IP_RECVERR /
> > IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
> > and future subflows.
> >
> > Patch 1 factors per-flag inet_assign_bit() calls in
> > sync_socket_options() into a mask-driven loop so future propagated
> > flags only need to extend MPTCP_INET_FLAGS_MASK.
> >
> > Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
> > apply it on the parent, and forward to every subflow under lock_sock()
> > so concurrent setsockopt callers cannot leave parent and subflows
> > desynchronized. Newly-joining subflows pick up the four RECVERR bits
> > through sync_socket_options().
> >
> > Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
> > from each subflow's error queue onto the parent's, so pollers see
> > EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
> > ICMP errors are dropped; they will be carried by a future
> > MPTCP_RECERR channel.
>
> Sorry for the delay: I saw Sashiko had some comments [1], and because I
> noticed you checked it before, I thought you were going to send a reply
> or a new version, and I forgot to ask here. So here it is: is the review
> correct?
>
> [1]
> https://sashiko.dev/#/patchset/20260509211651.104934-1-devnexen@gmail.com
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>

Yes, both findings are real. For v8 I'll drop the skb on splice failure
(this matches sock_queue_err_skb()'s own behaviour under rmem pressure:
-ENOMEM + sk_drops++, and the skb is freed by the caller).

With nothing retained on subflow err queues,
mptcp_subflow_errqueue_pending() can be removed from mptcp_poll(), which
also fixes the lockless conn_list walk, and the recvmsg retry in
mptcp_recv_error() goes away with it.

Cheers
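The drop-on-failure policy proposed for v8 can be modelled in a few lines of userspace C. This is a hedged sketch only: struct demo_queue, demo_queue_err() and demo_splice() are hypothetical names that imitate a receive-memory budget, and they are not the kernel's sock_queue_err_skb() or the planned mptcp_recv_error() code. What the model shows is that every entry is either moved or dropped, so nothing is left stranded on the source queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a parent error queue with a memory budget. */
struct demo_queue {
	size_t used;		/* bytes currently charged to the queue */
	size_t limit;		/* budget, loosely analogous to a receive buffer */
	unsigned int queued;	/* entries accepted */
	unsigned int drops;	/* entries dropped under memory pressure */
};

/* Returns 0 when the entry fits, -1 when the budget would be exceeded. */
static int demo_queue_err(struct demo_queue *q, size_t truesize)
{
	if (q->used + truesize >= q->limit)
		return -1;
	q->used += truesize;
	q->queued++;
	return 0;
}

/*
 * Move n entries onto the parent. An entry that does not fit is dropped
 * and counted instead of being kept on the source queue for a later retry.
 * Returns the number of entries moved.
 */
static unsigned int demo_splice(struct demo_queue *parent,
				const size_t *sizes, size_t n)
{
	unsigned int moved = 0;
	size_t i;

	assert(parent && sizes);

	for (i = 0; i < n; i++) {
		if (demo_queue_err(parent, sizes[i]) == 0)
			moved++;
		else
			parent->drops++;
	}
	return moved;
}
```

Because the source side is always emptied in this model, a poll path has no reason to inspect per-subflow queues, which is the simplification the reply above describes.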
Hi David,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25612442092
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0f646cd55809
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1092123
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that, despite all the efforts already made to keep the test
suite stable when it is run on a public CI like this one, some reported
issues may not be due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)