This series fixes a warning [1] that can be triggered by running the
selftest added in patch 3.
sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
implementing fast socket-level forwarding logic:
1. Users can obtain file descriptors through userspace socket()/accept()
interfaces, then call BPF syscall to perform these replacements.
2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
to replace handlers when TCP connections enter ESTABLISHED state
(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB).
However, when combined with MPTCP, an issue arises: MPTCP creates subflow
sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
and may incorrectly replace their sk_prot. We need to reject such
operations. In patch 1, we set psock_update_sk_prot to NULL in the
subflow's custom sk_prot.
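To illustrate the idea behind patch 1, here is a rough userspace model, not kernel code. All model_* names are hypothetical, and the -EINVAL return value is an assumption of the model. It only shows that a protocol whose psock_update_sk_prot callback is NULL gets rejected before any handler is replaced:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical names). */
struct model_sock;

struct model_proto {
	const char *name;
	/* NULL means: this protocol must not be added to a sockmap. */
	int (*psock_update_sk_prot)(struct model_sock *sk, bool restore);
};

struct model_sock {
	const struct model_proto *sk_prot;
	bool handlers_replaced;
};

/* Plain TCP: record whether the sockmap handlers are installed. */
static int model_tcp_update_proto(struct model_sock *sk, bool restore)
{
	sk->handlers_replaced = !restore;
	return 0;
}

static const struct model_proto model_tcp_prot = {
	.name = "TCP",
	.psock_update_sk_prot = model_tcp_update_proto,
};

/* Subflow proto: the callback is cleared, as described for patch 1. */
static const struct model_proto model_subflow_prot = {
	.name = "MPTCP subflow",
	.psock_update_sk_prot = NULL,
};

/* Model of a sockmap update: refuse sockets without the callback. */
static int model_sockmap_add(struct model_sock *sk)
{
	if (!sk->sk_prot->psock_update_sk_prot)
		return -EINVAL; /* error value assumed for this model */
	return sk->sk_prot->psock_update_sk_prot(sk, false);
}
```

In this model, adding a socket that uses model_tcp_prot succeeds and flips handlers_replaced, while a socket that uses model_subflow_prot is refused and left untouched.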
Additionally, if the server's listening socket has MPTCP enabled and the
client's TCP also uses MPTCP, we should allow the combination of subflow
and sockmap. This is because the latest Golang programs have enabled MPTCP
for listening sockets by default [2]. For programs already using sockmap,
upgrading Golang should not cause sockmap functionality to fail.
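For context, a minimal sketch of how a userspace program or runtime might opt in to MPTCP when creating a stream socket. This is an illustration only, not taken from Go's implementation: open_stream_socket() is a hypothetical helper, and the fallback policy (retry with plain TCP on any failure) is an assumption.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Older libc headers may not define IPPROTO_MPTCP; 262 is the Linux value. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Hypothetical helper: try to create an MPTCP socket first and fall back
 * to plain TCP if the kernel refuses (e.g. MPTCP not built in or disabled).
 * *used_mptcp is set to 1 when the returned fd is an MPTCP socket, else 0.
 * Returns the fd, or -1 if both attempts fail.
 */
static int open_stream_socket(int *used_mptcp)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd >= 0) {
		*used_mptcp = 1;
		return fd;
	}

	*used_mptcp = 0;
	return socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}
```

A program that listens on such a socket can still receive connections from plain TCP clients, so existing sockmap users may end up with MPTCP-enabled listeners without any change to their own code.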
Patch 2 prevents the WARNING from occurring.
[1] truncated warning:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 388 at net/mptcp/protocol.c:68 \
mptcp_stream_accept+0x34c/0x380
Modules linked in:
RIP: 0010:mptcp_stream_accept+0x34c/0x380
RSP: 0018:ffffc90000cf3cf8 EFLAGS: 00010202
PKRU: 55555554
Call Trace:
<TASK>
do_accept+0xeb/0x190
? __x64_sys_pselect6+0x61/0x80
? _raw_spin_unlock+0x12/0x30
? alloc_fd+0x11e/0x190
__sys_accept4+0x8c/0x100
__x64_sys_accept+0x1f/0x30
x64_sys_call+0x202f/0x20f0
do_syscall_64+0x72/0x9a0
? switch_fpu_return+0x60/0xf0
? irqentry_exit_to_user_mode+0xdb/0x1e0
? irqentry_exit+0x3f/0x50
? clear_bhb_loop+0x50/0xa0
? clear_bhb_loop+0x50/0xa0
? clear_bhb_loop+0x50/0xa0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
---[ end trace 0000000000000000 ]---
[2]: https://go-review.googlesource.com/c/go/+/607715
---
v3 -> v4: Addressed questions from Matthieu and Paolo, explained sockmap's
operational mechanism, and finalized the changes
v2 -> v3: Adopted Jakub Sitnicki's suggestions - atomic retrieval of
sk_family is required
v1 -> v2: Had initial discussion with Matthieu on sockmap and MPTCP
technical details
v3: https://lore.kernel.org/bpf/20251023125450.105859-1-jiayuan.chen@linux.dev/
v2: https://lore.kernel.org/bpf/20251020060503.325369-1-jiayuan.chen@linux.dev/T/#t
v1: https://lore.kernel.org/mptcp/a0a2b87119a06c5ffaa51427a0964a05534fe6f1@linux.dev/T/#t
Jiayuan Chen (3):
mptcp: disallow MPTCP subflows from sockmap
net,mptcp: fix proto fallback detection with BPF
selftests/bpf: Add mptcp test with sockmap
net/mptcp/protocol.c | 6 +-
net/mptcp/subflow.c | 8 +
.../testing/selftests/bpf/prog_tests/mptcp.c | 150 ++++++++++++++++++
.../selftests/bpf/progs/mptcp_sockmap.c | 43 +++++
4 files changed, 205 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcp_sockmap.c
base-commit: 89aec171d9d1ab168e43fcf9754b82e4c0aef9b9
--
2.43.0
Hi Jiayuan,

On 05/11/2025 12:36, Jiayuan Chen wrote:
> Overall, we encountered a warning [1] that can be triggered by running the
> selftest I provided.

Thank you for the v4!

> sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
> implementing fast socket-level forwarding logic:
> 1. Users can obtain file descriptors through userspace socket()/accept()
>    interfaces, then call BPF syscall to perform these replacements.
> 2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
>    to replace handlers when TCP connections enter ESTABLISHED state
>    (BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
>
> However, when combined with MPTCP, an issue arises: MPTCP creates subflow
> sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
> and may incorrectly replace their sk_prot. We need to reject such
> operations. In patch 1, we set psock_update_sk_prot to NULL in the
> subflow's custom sk_prot.

This new version looks good to me. I have some small comments on patches
1 and 2 that can only be addressed if a v5 is needed I think. I have some
questions for the 3rd patch. It would be good if someone else with more
experience with the BPF selftests can also look at it.

> Additionally, if the server's listening socket has MPTCP enabled and the
> client's TCP also uses MPTCP, we should allow the combination of subflow
> and sockmap. This is because the latest Golang programs have enabled MPTCP
> for listening sockets by default [2]. For programs already using sockmap,
> upgrading Golang should not cause sockmap functionality to fail.

Note: even if these patches here are needed to avoid stream corruption and
other issues, in your specific case with sockmap, I think it would be
better to set this env var until MPTCP support is added to sockmap:

  GODEBUG=multipathtcp=0

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Hi Jiayuan,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19103762226
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/bd1999ff8048
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1019848
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Jiayuan,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Script error! ❓
- KVM Validation: btf-debug (only bpftest_all): Script error! ❓
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19101579816
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/c93a653cfd56
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1019848
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Jiayuan,

On 05/11/2025 14:03, MPTCP CI wrote:
> Hi Jiayuan,
>
> Thank you for your modifications, that's great!
>
> Our CI did some validations and here is its report:
>
> - KVM Validation: normal (except selftest_mptcp_join): Success! ✅
> - KVM Validation: normal (only selftest_mptcp_join): Success! ✅
> - KVM Validation: debug (except selftest_mptcp_join): Success! ✅
> - KVM Validation: debug (only selftest_mptcp_join): Success! ✅
> - KVM Validation: btf-normal (only bpftest_all): Script error! ❓
> - KVM Validation: btf-debug (only bpftest_all): Script error! ❓

Please ignore: it looks like it is due to conflicts with WIP code from our
MPTCP tree. Your branch is on top of 'net' from 'netdev' and our CI applied
it on top of 'export' from 'mptcp'. I will check to add a workaround.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.