We encountered a warning [1] that can be reproduced by running the
selftest added in patch 3.
sockmap works by replacing a socket's sk_data_ready, recvmsg, and sendmsg
operations in order to implement fast socket-level forwarding. The
replacement can be triggered in two ways:
1. Users can obtain file descriptors through the userspace socket()/accept()
   interfaces, then call the BPF syscall to perform the replacement.
2. Users can also call the bpf_sock_hash_update helper (in sockops programs)
   to replace the handlers when a TCP connection enters the ESTABLISHED state
   (BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB).
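As a rough illustration of the handler replacement described above, the
following userspace C sketch models a socket whose operations table is
swapped when it is added to a sockmap. This is a toy model and not kernel
code: every toy_* name and the PATH_* constants are hypothetical, and the
real kernel structures carry far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel types; none of these are real kernel definitions. */
struct toy_sock;

struct toy_proto {
	const char *name;
	int (*data_ready)(struct toy_sock *sk);
	int (*recvmsg)(struct toy_sock *sk);
	int (*sendmsg)(struct toy_sock *sk);
};

struct toy_sock {
	const struct toy_proto *prot;       /* handler table currently in use */
	const struct toy_proto *saved_prot; /* original table, kept for restore */
};

/* Each handler only reports which path it belongs to. */
enum { PATH_NATIVE = 1, PATH_SOCKMAP = 2 };

static int native_op(struct toy_sock *sk)  { (void)sk; return PATH_NATIVE; }
static int sockmap_op(struct toy_sock *sk) { (void)sk; return PATH_SOCKMAP; }

static const struct toy_proto toy_tcp_prot = {
	"toy-tcp", native_op, native_op, native_op
};
static const struct toy_proto toy_sockmap_prot = {
	"toy-sockmap", sockmap_op, sockmap_op, sockmap_op
};

/* Adding the socket to the map swaps in the forwarding handlers. */
static void toy_sockmap_attach(struct toy_sock *sk)
{
	assert(sk != NULL && sk->prot != NULL);
	sk->saved_prot = sk->prot;
	sk->prot = &toy_sockmap_prot;
}

/* Removing the socket from the map restores the original handlers. */
static void toy_sockmap_detach(struct toy_sock *sk)
{
	assert(sk != NULL && sk->saved_prot != NULL);
	sk->prot = sk->saved_prot;
	sk->saved_prot = NULL;
}
```

For a socket initialised with `.prot = &toy_tcp_prot`, a call through
`sk.prot->recvmsg(&sk)` takes the native path before `toy_sockmap_attach(&sk)`
and the sockmap path after it. `toy_sockmap_detach(&sk)` switches it back.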
However, when combined with MPTCP, an issue arises: MPTCP creates subflow
sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
and may incorrectly replace their sk_prot. We need to reject such
operations. In patch 1, we set psock_update_sk_prot to NULL in the
subflow's custom sk_prot.
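The idea behind patch 1 can be sketched in userspace C: a protocol opts out
of sockmap by leaving its psock_update_sk_prot hook NULL, and the update path
refuses such sockets. This is only a model under stated assumptions. The
toy_* types and functions are hypothetical, and the -EINVAL return value is
an assumption of this sketch, not something stated in this cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for kernel types; not the real definitions. */
struct toy_sock;

struct toy_proto {
	const char *name;
	/* Hook a protocol provides when it supports sockmap; NULL opts out. */
	int (*psock_update_sk_prot)(struct toy_sock *sk);
};

struct toy_sock {
	const struct toy_proto *prot;
	int in_sockmap; /* set once the handlers have been replaced */
};

/* Plain TCP supports sockmap, so it supplies the hook. */
static int toy_tcp_update_prot(struct toy_sock *sk)
{
	sk->in_sockmap = 1;
	return 0;
}

static const struct toy_proto toy_tcp_prot = {
	"toy-tcp", toy_tcp_update_prot
};

/* The subflow's custom proto leaves the hook NULL, as patch 1 describes. */
static const struct toy_proto toy_subflow_prot = {
	"toy-mptcp-subflow", NULL
};

/* Model of a sockmap update: refuse protocols that lack the hook. */
static int toy_sockmap_update(struct toy_sock *sk)
{
	assert(sk != NULL && sk->prot != NULL);
	if (!sk->prot->psock_update_sk_prot)
		return -EINVAL; /* assumed error code for this sketch */
	return sk->prot->psock_update_sk_prot(sk);
}
```

In this model, toy_sockmap_update() succeeds for a socket using toy_tcp_prot
and fails for one using toy_subflow_prot, leaving the subflow's handlers
untouched.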
Additionally, if the server's listening socket has MPTCP enabled and the
client's TCP also uses MPTCP, we should allow the combination of subflow
and sockmap. This is because recent Go releases enable MPTCP on listening
sockets by default [2]. For programs already using sockmap, upgrading Go
should not cause sockmap functionality to fail.
Patch 2 prevents the WARNING from occurring.
Despite these patches fixing stream corruption, users of sockmap must set
GODEBUG=multipathtcp=0 to disable MPTCP until sockmap fully supports it.
[1] truncated warning:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 388 at net/mptcp/protocol.c:68 mptcp_stream_accept+0x34c/0x380
Modules linked in:
RIP: 0010:mptcp_stream_accept+0x34c/0x380
RSP: 0018:ffffc90000cf3cf8 EFLAGS: 00010202
PKRU: 55555554
Call Trace:
<TASK>
do_accept+0xeb/0x190
? __x64_sys_pselect6+0x61/0x80
? _raw_spin_unlock+0x12/0x30
? alloc_fd+0x11e/0x190
__sys_accept4+0x8c/0x100
__x64_sys_accept+0x1f/0x30
x64_sys_call+0x202f/0x20f0
do_syscall_64+0x72/0x9a0
? switch_fpu_return+0x60/0xf0
? irqentry_exit_to_user_mode+0xdb/0x1e0
? irqentry_exit+0x3f/0x50
? clear_bhb_loop+0x50/0xa0
? clear_bhb_loop+0x50/0xa0
? clear_bhb_loop+0x50/0xa0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
---[ end trace 0000000000000000 ]---
[2]: https://go-review.googlesource.com/c/go/+/607715
---
v4 -> v5: Dropped redundant selftest code, updated the Fixes tag, and
added a Reviewed-by tag.
v3 -> v4: Addressed questions from Matthieu and Paolo, explained sockmap's
operational mechanism, and finalized the changes.
v2 -> v3: Adopted Jakub Sitnicki's suggestion: sk_family must be read
atomically.
v1 -> v2: Had an initial discussion with Matthieu on sockmap and MPTCP
technical details.
v4: https://lore.kernel.org/bpf/20251105113625.148900-1-jiayuan.chen@linux.dev/
v3: https://lore.kernel.org/bpf/20251023125450.105859-1-jiayuan.chen@linux.dev/
v2: https://lore.kernel.org/bpf/20251020060503.325369-1-jiayuan.chen@linux.dev/T/#t
v1: https://lore.kernel.org/mptcp/a0a2b87119a06c5ffaa51427a0964a05534fe6f1@linux.dev/T/#t
Jiayuan Chen (3):
mptcp: disallow MPTCP subflows from sockmap
net,mptcp: fix proto fallback detection with BPF
selftests/bpf: Add mptcp test with sockmap
net/mptcp/protocol.c | 6 +-
net/mptcp/subflow.c | 8 +
.../testing/selftests/bpf/prog_tests/mptcp.c | 141 ++++++++++++++++++
.../selftests/bpf/progs/mptcp_sockmap.c | 43 ++++++
4 files changed, 196 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcp_sockmap.c
base-commit: 8c0726e861f3920bac958d76cf134b5a3aa14ce4
--
2.43.0
Hello:
This series was applied to bpf/bpf.git (master)
by Martin KaFai Lau <martin.lau@kernel.org>:
On Tue, 11 Nov 2025 14:02:49 +0800 you wrote:
> Overall, we encountered a warning [1] that can be triggered by running the
> selftest I provided.
>
> sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
> implementing fast socket-level forwarding logic:
> 1. Users can obtain file descriptors through userspace socket()/accept()
> interfaces, then call BPF syscall to perform these replacements.
> 2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
> to replace handlers when TCP connections enter ESTABLISHED state
> (BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
>
> [...]
Here is the summary with links:
- [net,v5,1/3] mptcp: disallow MPTCP subflows from sockmap
https://git.kernel.org/bpf/bpf/c/fbade4bd08ba
- [net,v5,2/3] net,mptcp: fix proto fallback detection with BPF
https://git.kernel.org/bpf/bpf/c/c77b3b79a92e
- [net,v5,3/3] selftests/bpf: Add mptcp test with sockmap
https://git.kernel.org/bpf/bpf/c/cb730e4ac1b4
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Hi net and bpf-net maintainers,

On 11/11/2025 07:02, Jiayuan Chen wrote:
> Overall, we encountered a warning [1] that can be triggered by running the
> selftest I provided.
>
> sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
> implementing fast socket-level forwarding logic:
> 1. Users can obtain file descriptors through userspace socket()/accept()
>    interfaces, then call BPF syscall to perform these replacements.
> 2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
>    to replace handlers when TCP connections enter ESTABLISHED state
>    (BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
>
> However, when combined with MPTCP, an issue arises: MPTCP creates subflow
> sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
> and may incorrectly replace their sk_prot. We need to reject such
> operations. In patch 1, we set psock_update_sk_prot to NULL in the
> subflow's custom sk_prot.
>
> Additionally, if the server's listening socket has MPTCP enabled and the
> client's TCP also uses MPTCP, we should allow the combination of subflow
> and sockmap. This is because the latest Golang programs have enabled MPTCP
> for listening sockets by default [2]. For programs already using sockmap,
> upgrading Golang should not cause sockmap functionality to fail.
>
> Patch 2 prevents the WARNING from occurring.

I think this series can be applied directly in 'net', if that's OK for
both of you.

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
On Tue, 11 Nov 2025 11:35:04 +0100 Matthieu Baerts wrote:
> I think this series can be applied directly in 'net', if that's OK for
> both of you.

Also no preference here, Martin mentioned he will take it via bpf
tomorrow.

Please let us know on the off chance that you have anything that may
conflict queued up. These will likely need a week of travel before
they reach net in this case.
Hi Jakub,

On 13/11/2025 03:23, Jakub Kicinski wrote:
> On Tue, 11 Nov 2025 11:35:04 +0100 Matthieu Baerts wrote:
>> I think this series can be applied directly in 'net', if that's OK for
>> both of you.
>
> Also no preference here, Martin mentioned he will take it via bpf
> tomorrow.
>
> Please let us know on the off chance that you have anything that may
> conflict queued up. These will likely need a week of travel before
> they reach net in this case.

No problem for me, this can go to bpf-net first. We don't have pending
patches modifying these parts.

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.