[PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket

David Carlier posted 4 patches 1 week ago
Failed in applying to current master
There is a newer version of this series
Posted by David Carlier 1 week ago
MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
parent socket does not currently provide usable MSG_ERRQUEUE handling.

This series wires the MPTCP socket up to the IPv4/IPv6 error queue
paths. It propagates RECVERR-related sockopts to existing and future
subflows, makes poll() report pending errqueue activity through the
parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
consume queued errors with the parent socket ABI.

A new prerequisite patch factors the per-flag inet_flags propagation
in sync_socket_options() into a single masked word copy, so further
inet_flags propagated by MPTCP can be added by extending the mask
rather than touching the call site.
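
As a rough userspace illustration of what a "masked word copy" means here (not the kernel code itself), the sketch below copies only the bits selected by a mask from one flags word to another. The EX_* names and bit positions are hypothetical stand-ins; the real inet_flags layout and MPTCP_INET_FLAGS_MASK are defined in the kernel tree.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
enum {
	EX_FLAG_TRANSPARENT = 1u << 0,
	EX_FLAG_FREEBIND    = 1u << 1,
	EX_FLAG_UNRELATED   = 1u << 2,
};

/* Hypothetical mask of the bits that should follow the parent socket. */
#define EX_PROPAGATED_MASK (EX_FLAG_TRANSPARENT | EX_FLAG_FREEBIND)

/*
 * Return dst with the bits selected by mask replaced by the matching
 * bits of src; bits outside the mask keep their dst value.
 */
static uint32_t copy_masked_flags(uint32_t dst, uint32_t src, uint32_t mask)
{
	return (dst & ~mask) | (src & mask);
}
```

With this shape, propagating one more flag only requires adding its bit to the mask.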

Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
four RECVERR bits, dropping the family-specific helpers from v3.

Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>

v3 -> v4:
  - New patch 1/4: factor inet_flags propagation in
    sync_socket_options() through MPTCP_INET_FLAGS_MASK, per Paolo's
    review.
  - Patch 2/4 (was 1/3): drop the mptcp_recverr_enabled() and
    mptcp_subflow_set_recverr() helpers; route the setsockopt path
    through mptcp_setsockopt_all_sf(). Inherit the four RECVERR bits
    via MPTCP_INET_FLAGS_MASK in sync_socket_options() instead of
    explicit inet[6]_assign_bit() calls.
  - Patch 3/4 (was 2/3): rework the MSG_ERRQUEUE plumbing per Paolo's
    review. Subflow err skbs are now spliced onto the parent msk's
    sk_error_queue from __mptcp_subflow_error_report() via the new
    __mptcp_subflow_splice_errqueue() helper. recvmsg(MSG_ERRQUEUE)
    on the parent reverts to plain inet_recv_error(), and mptcp_poll()
    only inspects the parent's sk_error_queue -- no more on-demand
    subflow walks, no extra lock_sock() / data_lock() in the poll or
    recv paths. Keep the original early-return structure of
    __mptcp_subflow_error_report() and fix the reverse christmas-tree
    variable order Paolo flagged.

v2 -> v3:
  - Only consume ssk->sk_err in the fallback / MPC-connect branch of
    __mptcp_subflow_error_report(). Steady-state MPTCP now leaves
    TCP's one-shot sk_err to TCP's own consumer instead of silently
    draining it via sock_error().
  - In mptcp_recv_error(), also route to inet_recv_error() when
    sk->sk_err is set, so a fallback-propagated error reaches userspace
    even when the parent errqueue is empty.
  - Scope the new selftest to IP_RECVERR sockopt propagation only.
    End-to-end errqueue delivery (TX timestamps, ICMP, zerocopy)
    depends on subflow-side producers that are out of scope for this
    series and will be covered by follow-up work. Fixes the
    mptcp_sockopt selftest timeout reported by the MPTCP CI on v2.
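
To make the sockopt being tested concrete, here is a minimal userspace sketch that sets IP_RECVERR on a socket and reads it back. This is not the selftest from patch 4/4; recverr_roundtrip() is a hypothetical helper, and it only exercises the standard setsockopt()/getsockopt() calls.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Set IP_RECVERR to `on` (0 or 1) on fd and read the value back.
 * Returns the value reported by getsockopt(), or -1 if either call fails.
 */
static int recverr_roundtrip(int fd, int on)
{
	int val = on;
	socklen_t len = sizeof(val);

	if (setsockopt(fd, IPPROTO_IP, IP_RECVERR, &val, sizeof(val)) < 0)
		return -1;

	val = -1;
	if (getsockopt(fd, IPPROTO_IP, IP_RECVERR, &val, &len) < 0)
		return -1;

	return val;
}
```

The same helper could be pointed at a socket created with IPPROTO_MPTCP; whether the getsockopt() half succeeds there depends on the kernel in use, so that case is left as an assumption to verify.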

v1 -> v2:
  - Retargeted to mptcp-next per Matthieu Baerts' feedback (net-next
    closed during the merge window; iterate on the MPTCP tree).
  - Guard mptcp_setsockopt_v6_recverr() and its dispatch cases in
    mptcp_setsockopt_v6() with #if IS_ENABLED(CONFIG_IPV6) to fix
    the MPTCP CI link break on without_ipv6/with_mptcp configs
    (undefined reference to ipv6_setsockopt).

v1: https://lore.kernel.org/mptcp/20260421152216.38127-1-devnexen@gmail.com/
v2: https://lore.kernel.org/mptcp/20260421191337.58341-1-devnexen@gmail.com/
v3: https://lore.kernel.org/mptcp/20260421223338.52743-1-devnexen@gmail.com/

David Carlier (4):
  mptcp: sockopt: factor inet_flags propagation into a mask
  mptcp: propagate RECVERR sockopts to subflows
  mptcp: support MSG_ERRQUEUE on the parent socket
  selftests: mptcp: cover IP_RECVERR sockopt propagation

 net/mptcp/protocol.c                          |  33 +++++-
 net/mptcp/sockopt.c                           | 107 ++++++++++++++----
 .../selftests/net/mptcp/mptcp_sockopt.c       |  55 +++++++++
 3 files changed, 170 insertions(+), 25 deletions(-)

--
2.53.0
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by MPTCP CI 6 days, 14 hours ago
Hi David,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️ 
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_fastopen packetdrill_sockopts ⚠️ 
- KVM Validation: debug (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️ 
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25071789731

Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/7688d292b14a
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1086438


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do
not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by Matthieu Baerts 6 days, 15 hours ago
Hi David,

Thank you for the new version.

On 27/04/2026 23:10, David Carlier wrote:
> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> parent socket does not currently provide usable MSG_ERRQUEUE handling.
> 
> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> paths. It propagates RECVERR-related sockopts to existing and future
> subflows, makes poll() report pending errqueue activity through the
> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
> consume queued errors with the parent socket ABI.
> 
> A new prerequisite patch factors the per-flag inet_flags propagation
> in sync_socket_options() into a single masked word copy, so further
> inet_flags propagated by MPTCP can be added by extending the mask
> rather than touching the call site.
> 
> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
> four RECVERR bits, dropping the family-specific helpers from v3.
> 
> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>

I didn't review it, but I notice that the CI cannot apply your series,
because it looks like it is not based on the one you mentioned here.

Can you either remove this line, or rebase your series on top of this
other patch?

Also, please don't send your series as a reply to a previous posting;
please use a new thread instead. That's what is usually done, it is
clearer, and some tools don't support replies.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by Matthieu Baerts 6 days, 15 hours ago
On 28/04/2026 20:48, Matthieu Baerts wrote:
> Hi David,
> 
> Thank you for the new version.
> 
> On 27/04/2026 23:10, David Carlier wrote:
>> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
>> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>>
>> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
>> paths. It propagates RECVERR-related sockopts to existing and future
>> subflows, makes poll() report pending errqueue activity through the
>> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
>> consume queued errors with the parent socket ABI.
>>
>> A new prerequisite patch factors the per-flag inet_flags propagation
>> in sync_socket_options() into a single masked word copy, so further
>> inet_flags propagated by MPTCP can be added by extending the mask
>> rather than touching the call site.
>>
>> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
>> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
>> four RECVERR bits, dropping the family-specific helpers from v3.
>>
>> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
> 
> I didn't review it, but I notice that the CI cannot apply your series,
> because it looks like it is not based on the one you mentioned here.
> 
> Can you either remove this line, or rebase your series on top of this
> other patch?
> 
> Also, please don't send your series as a reply to a previous posting,
> please use a new thread. That's what is usually done, clearer, plus some
> tools don't support replies.

Note: I just manually resolved the conflicts and sent the series to the
CI, so you don't have to resend the series just to retrigger the CI.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by Matthieu Baerts 3 days, 19 hours ago
Hi David,

On 28/04/2026 20:56, Matthieu Baerts wrote:
> On 28/04/2026 20:48, Matthieu Baerts wrote:
>> Hi David,
>>
>> Thank you for the new version.
>>
>> On 27/04/2026 23:10, David Carlier wrote:
>>> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
>>> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>>>
>>> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
>>> paths. It propagates RECVERR-related sockopts to existing and future
>>> subflows, makes poll() report pending errqueue activity through the
>>> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
>>> consume queued errors with the parent socket ABI.
>>>
>>> A new prerequisite patch factors the per-flag inet_flags propagation
>>> in sync_socket_options() into a single masked word copy, so further
>>> inet_flags propagated by MPTCP can be added by extending the mask
>>> rather than touching the call site.
>>>
>>> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
>>> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
>>> four RECVERR bits, dropping the family-specific helpers from v3.
>>>
>>> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
>>
>> I didn't review it, but I notice that the CI cannot apply your series,
>> because it looks like it is not based on the one you mentioned here.
>>
>> Can you either remove this line, or rebase your series on top of this
>> other patch?
>>
>> Also, please don't send your series as a reply to a previous posting,
>> please use a new thread. That's what is usually done, clearer, plus some
>> tools don't support replies.
> 
> Note: I just manually resolved the conflicts and sent the series to the
> CI, not to have to resend a series just to retrigger the CI.

It looks like the CI (and sashiko) found some issues with this series.

But globally, I'm a bit puzzled: with MPTCP, there might be multiple
paths being used, and reporting errors about all of them when the
"legacy" RECVERR socket options are used will confuse the userspace that
doesn't (have to) know multiple subflows are being used. In this case,
either messages should be filtered (might be hard to handle all
use-cases and maintain that?), or this should be limited to cases where
only one subflow is being used. Which leads me to this question: what's
your use-case exactly? What are you trying to solve?

It might be easier to have a dedicated MPTCP_RECERR, and eventually
propagate more MPTCP-specific messages. Something that could be linked to:

  https://github.com/multipath-tcp/mptcp_net-next/issues/78

WDYT?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by David CARLIER 3 days, 18 hours ago
Hi Matthieu,

  On 01/05/2026 16:49, Matthieu Baerts wrote:
  > It looks like the CI (and sashiko) found some issues with this series.

  For v5:

  - 1/4: per-bit inet_assign_bit() loop instead of WRITE_ONCE(), keeps
    atomicity.
  - 2/4: add missing sockopt_seq_inc(msk).
  - 2/4: skip family-mismatched subflows in the v4/v6 helpers.
  - 2/4: snapshot optval to a local int, pass KERNEL_SOCKPTR(&val) into
    the loop.
  - 3/4: pull-on-drain from mptcp_recv_error() so a parent-ENOMEM does
    not strand subflow skbs.
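
The first item can be illustrated in userspace with C11 atomics. This is only a sketch of the idea, assuming the concern is that a whole-word read-modify-write can overwrite a concurrent update to a bit outside the mask, whereas per-bit atomic operations only ever touch the selected bit. propagate_bits_atomically() is a hypothetical name; the kernel helper mentioned above is inet_assign_bit().

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * For every bit set in mask, make the same bit in *dst equal to the bit
 * in src, using one atomic OR or AND per bit. Bits outside the mask are
 * never written, so concurrent updates to them are preserved.
 */
static void propagate_bits_atomically(_Atomic uint32_t *dst, uint32_t src,
				      uint32_t mask)
{
	for (unsigned int bit = 0; bit < 32; bit++) {
		uint32_t b = UINT32_C(1) << bit;

		if (!(mask & b))
			continue;

		if (src & b)
			atomic_fetch_or(dst, b);	/* set just this bit */
		else
			atomic_fetch_and(dst, ~b);	/* clear just this bit */
	}
}
```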

  Will also re-run the docker repro to check the selftest_mptcp_join /
  packetdrill rows are pre-existing.

  > But globally, I'm a bit puzzled: with MPTCP, there might be multiple
  > paths being used, and reporting errors about all of them when the
  > "legacy" RECVERR socket options are used will confuse the userspace
  > that doesn't (have to) know multiple subflows are being used.

  Fair, and Paolo raised it on v3. The use-case is tx timestamping and
  MSG_ZEROCOPY completions - both are tied to user data, not the
  subflow that carried it, so no subflow identity leaks into the cmsg.
  ICMP/ICMPv6 is the part that does. v5 will filter the splice by
  SO_EE_ORIGIN: forward TIMESTAMPING / ZEROCOPY / LOCAL, drop ICMP.

  > It might be easier to have a dedicated MPTCP_RECERR, and eventually
  > propagate more MPTCP-specific messages. Something that could be
  > linked to:
  >   https://github.com/multipath-tcp/mptcp_net-next/issues/78

  Agreed - subflow ICMP and #78's lifecycle events belong there. As a
  follow-up once v5 lands.

  Cheers
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by Matthieu Baerts 3 days, 18 hours ago
On 01/05/2026 17:28, David CARLIER wrote:
> Hi Matthieu,
> 
>   On 01/05/2026 16:49, Matthieu Baerts wrote:
>   > It looks like the CI (and sashiko) found some issues with this series.

(Please do fix your email client to avoid this formatting: some of your
emails are OK, but not all of them.)

>   For v5:
> 
>   - 1/4: per-bit inet_assign_bit() loop instead of WRITE_ONCE(), keeps
>     atomicity.
>   - 2/4: add missing sockopt_seq_inc(msk).
>   - 2/4: skip family-mismatched subflows in the v4/v6 helpers.
>   - 2/4: snapshot optval to a local int, pass KERNEL_SOCKPTR(&val) into
>     the loop.

(While at it, your new helpers mptcp_setsockopt_v[46]_recverr could have
a generic name)

>   - 3/4: pull-on-drain from mptcp_recv_error() so a parent-ENOMEM does
>     not strand subflow skbs.
> 
>   Will also re-run the docker repro to check the selftest_mptcp_join /
>   packetdrill rows are pre-existing.

The packetdrill errors might be pre-existing, someone should look at
improving the situation there:

  https://ci-results.mptcp.dev/flakes.html

>   > But globally, I'm a bit puzzled: with MPTCP, there might be multiple
>   > paths being used, and reporting errors about all of them when the
>   > "legacy" RECVERR socket options are used will confuse the userspace
>   > that doesn't (have to) know multiple subflows are being used.
> 
>   Fair, and Paolo raised it on v3. The use-case is tx timestamping and
>   MSG_ZEROCOPY completions - both are tied to user data, not the
>   subflow that carried it, so no subflow identity leaks into the cmsg.
>   ICMP/ICMPv6 is the part that does. v5 will filter the splice by
>   SO_EE_ORIGIN: forward TIMESTAMPING / ZEROCOPY / LOCAL, drop ICMP.

Maybe OK with this filter, indeed.

>   > It might be easier to have a dedicated MPTCP_RECERR, and eventually
>   > propagate more MPTCP-specific messages. Something that could be
>   > linked to:
>   >   https://github.com/multipath-tcp/mptcp_net-next/issues/78
> 
>   Agreed - subflow ICMP and #78's lifecycle events belong there. As a
>   follow-up once v5 lands.

Indeed, better to split them.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
Posted by David CARLIER 6 days, 14 hours ago
Hi Matthieu,

On Tue, 28 Apr 2026 at 19:56, Matthieu Baerts <matttbe@kernel.org> wrote:
>
> On 28/04/2026 20:48, Matthieu Baerts wrote:
> > Hi David,
> >
> > Thank you for the new version.
> >
> > On 27/04/2026 23:10, David Carlier wrote:
> >> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> >> parent socket does not currently provide usable MSG_ERRQUEUE handling.
> >>
> >> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> >> paths. It propagates RECVERR-related sockopts to existing and future
> >> subflows, makes poll() report pending errqueue activity through the
> >> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
> >> consume queued errors with the parent socket ABI.
> >>
> >> A new prerequisite patch factors the per-flag inet_flags propagation
> >> in sync_socket_options() into a single masked word copy, so further
> >> inet_flags propagated by MPTCP can be added by extending the mask
> >> rather than touching the call site.
> >>
> >> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
> >> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
> >> four RECVERR bits, dropping the family-specific helpers from v3.
> >>
> >> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
> >
> > I didn't review it, but I notice that the CI cannot apply your series,
> > because it looks like it is not based on the one you mentioned here.
> >
> > Can you either remove this line, or rebase your series on top of this
> > other patch?
> >
> > Also, please don't send your series as a reply to a previous posting,
> > please use a new thread. That's what is usually done, clearer, plus some
> > tools don't support replies.
>
> Note: I just manually resolved the conflicts and sent the series to the
> CI, not to have to resend a series just to retrigger the CI.

Appreciated. Cheers.
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>