Recently, I found a network having weird behaviours with MPTCP packets:

- The first connection to a server had a successful 3WHS, then MPTCP
  options got stripped off.

- The next one had the first SYN (with or without MPTCP) and 5
  retransmissions dropped, before being apparently intercepted and
  proxied to the end server.

- (The next ones were sometimes intercepted, sometimes not, or dropped
  at the beginning. I'm trying to find out which kind of "optimiser" is
  causing this.)

The result of this was a blackhole being "wrongly" detected, and no way
to force connections hitting quite a few SYN drops to use MPTCP in the
end.

In this series, we have:

- A small fix for the doc. (applied)

- A new sysctl to change the number of SYNs retransmitted with MPTCP
  options before falling back to TCP. The modification looks simple
  enough to still be sent to netdev before the closure, I think.
  (applied)

- A fix to turn on the blackhole protection only when the first SYN
  retransmitted without the MPTCP options is accepted, instead of any
  later one. The blackhole feature was supposed to do that from the
  beginning, but a check was wrongly placed. I think we should consider
  this as a fix, even if there is also a risk of not detecting a
  blackhole if the first SYN retransmitted without MPTCP is dropped by
  accident. But that seems less likely for an "MPTCP firewall
  blackhole", and I guess not all future MPTCP connections will behave
  exactly like that. It then sounds safer to reduce the chances of
  enabling the blackhole protection by accident, and apply this patch.

- A small cleanup to exit early.

Please note that patches 1 and 2 have already been applied in our tree.
So this version only has patches 3 and 4. Patch 3 is a fix, maybe not
patch 4, but it might be easier to squash patches 3 and 4 and send the
result as a fix to -net.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v2:
- Previous patches 1 & 2 have been applied.
- Patch 1 (ex 3) has not been modified.
- Patch 2 has been added; it can be squashed into the previous one.
- Link to v1: https://lore.kernel.org/r/20250114-mpc-no-blackhole-v1-0-994bd2a357fb@kernel.org

---
Matthieu Baerts (NGI0) (2):
      mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
      mptcp: blackhole: avoid checking the state twice

 net/mptcp/ctrl.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)
---
base-commit: a9a670c7f9f94ac66e6a79ecb81e4e7a6e20f001
change-id: 20250114-mpc-no-blackhole-526a61ea0334

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>
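To make the intent of the blackhole fix described above more concrete, here
is a small standalone sketch of the rule it aims for. This is not the kernel
code from net/mptcp/ctrl.c: it is a userspace model, and every name in it
(struct subflow_sim, on_syn_retransmit(), should_enable_blackhole_protection())
is made up for illustration.

        #include <stdbool.h>
        #include <stdio.h>

        struct subflow_sim {
                bool request_mptcp;  /* the connection asked for MPTCP */
                bool mpc_drop;       /* MPTCP options dropped on the 1st retrans */
        };

        /* Called for each SYN retransmission (retrans == 1 for the first one). */
        static void on_syn_retransmit(struct subflow_sim *sf, int retrans)
        {
                if (sf->request_mptcp && retrans == 1) {
                        /* Strip the MPTCP options for the first retransmission
                         * only, and remember that this is the SYN whose fate
                         * matters for blackhole detection.
                         */
                        sf->request_mptcp = false;
                        sf->mpc_drop = true;
                } else {
                        /* Any later retransmission: if one of these is finally
                         * accepted, it tells us nothing specific about MPTCP
                         * being blackholed, so do not arm the protection.
                         */
                        sf->mpc_drop = false;
                }
        }

        /* Called when the handshake finally completes without MPTCP. */
        static bool should_enable_blackhole_protection(const struct subflow_sim *sf)
        {
                return sf->mpc_drop;
        }

        int main(void)
        {
                struct subflow_sim sf = { .request_mptcp = true };

                on_syn_retransmit(&sf, 1);  /* 1st retrans: sent without MPTCP options */
                printf("accepted after 1st retrans -> protection: %d\n",
                       should_enable_blackhole_protection(&sf));

                on_syn_retransmit(&sf, 2);  /* a 2nd retrans was needed as well */
                printf("accepted after 2nd retrans -> protection: %d\n",
                       should_enable_blackhole_protection(&sf));
                return 0;
        }

In other words, the protection is armed only when exactly one MPTCP SYN was
dropped and the very next, MPTCP-less retransmission got through; if more
retransmissions were needed, the success of a later SYN no longer proves
anything about MPTCP being filtered on the path.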
On Wed, 22 Jan 2025, Matthieu Baerts (NGI0) wrote:

> Recently, I found a network having weird behaviours with MPTCP packets:
>
> - The first connection to a server had a successful 3WHS, then MPTCP
>   options got stripped off.
>
> - The next one had the first SYN (with or without MPTCP) and 5
>   retransmissions dropped, before being apparently intercepted and
>   proxied to the end server.
>
> - (The next ones were sometimes intercepted, sometimes not, or dropped
>   at the beginning. I'm trying to find out which kind of "optimiser" is
>   causing this.)
>
> The result of this was a blackhole being "wrongly" detected, and no way
> to force connections hitting quite a few SYN drops to use MPTCP in the
> end.
>
> In this series, we have:
>
> - A small fix for the doc. (applied)
>
> - A new sysctl to change the number of SYNs retransmitted with MPTCP
>   options before falling back to TCP. The modification looks simple
>   enough to still be sent to netdev before the closure, I think.
>   (applied)
>
> - A fix to turn on the blackhole protection only when the first SYN
>   retransmitted without the MPTCP options is accepted, instead of any
>   later one. The blackhole feature was supposed to do that from the
>   beginning, but a check was wrongly placed. I think we should consider
>   this as a fix, even if there is also a risk of not detecting a
>   blackhole if the first SYN retransmitted without MPTCP is dropped by
>   accident. But that seems less likely for an "MPTCP firewall
>   blackhole", and I guess not all future MPTCP connections will behave
>   exactly like that. It then sounds safer to reduce the chances of
>   enabling the blackhole protection by accident, and apply this patch.
>
> - A small cleanup to exit early.
>
> Please note that patches 1 and 2 have already been applied in our tree.
> So this version only has patches 3 and 4. Patch 3 is a fix, maybe not
> patch 4, but it might be easier to squash patches 3 and 4 and send the
> result as a fix to -net.
>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Changes in v2:
> - Previous patches 1 & 2 have been applied.
> - Patch 1 (ex 3) has not been modified.
> - Patch 2 has been added; it can be squashed into the previous one.
> - Link to v1: https://lore.kernel.org/r/20250114-mpc-no-blackhole-v1-0-994bd2a357fb@kernel.org
>

Hi Matthieu -

Both patches in this series look good to me:

Reviewed-by: Mat Martineau <martineau@kernel.org>

I'd suggest either queuing patch 2 for -next, or dropping it. The
optimizer probably makes it a no-op in terms of performance anyway, and,
now looking at the size of the patch (even though it's mostly due to
re-indenting and moving a couple of hunks of code around), it does seem
like a moderate amount of churn.

- Mat

> ---
> Matthieu Baerts (NGI0) (2):
>       mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
>       mptcp: blackhole: avoid checking the state twice
>
>  net/mptcp/ctrl.c | 30 +++++++++++++++++-------------
>  1 file changed, 17 insertions(+), 13 deletions(-)
> ---
> base-commit: a9a670c7f9f94ac66e6a79ecb81e4e7a6e20f001
> change-id: 20250114-mpc-no-blackhole-526a61ea0334
>
> Best regards,
> --
> Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
Hi Mat,

On 24/01/2025 02:42, Mat Martineau wrote:
> On Wed, 22 Jan 2025, Matthieu Baerts (NGI0) wrote:
>
>> Recently, I found a network having weird behaviours with MPTCP packets:
>>
>> - The first connection to a server had a successful 3WHS, then MPTCP
>>   options got stripped off.
>>
>> - The next one had the first SYN (with or without MPTCP) and 5
>>   retransmissions dropped, before being apparently intercepted and
>>   proxied to the end server.
>>
>> - (The next ones were sometimes intercepted, sometimes not, or dropped
>>   at the beginning. I'm trying to find out which kind of "optimiser" is
>>   causing this.)
>>
>> The result of this was a blackhole being "wrongly" detected, and no way
>> to force connections hitting quite a few SYN drops to use MPTCP in the
>> end.
>>
>> In this series, we have:
>>
>> - A small fix for the doc. (applied)
>>
>> - A new sysctl to change the number of SYNs retransmitted with MPTCP
>>   options before falling back to TCP. The modification looks simple
>>   enough to still be sent to netdev before the closure, I think.
>>   (applied)
>>
>> - A fix to turn on the blackhole protection only when the first SYN
>>   retransmitted without the MPTCP options is accepted, instead of any
>>   later one. The blackhole feature was supposed to do that from the
>>   beginning, but a check was wrongly placed. I think we should consider
>>   this as a fix, even if there is also a risk of not detecting a
>>   blackhole if the first SYN retransmitted without MPTCP is dropped by
>>   accident. But that seems less likely for an "MPTCP firewall
>>   blackhole", and I guess not all future MPTCP connections will behave
>>   exactly like that. It then sounds safer to reduce the chances of
>>   enabling the blackhole protection by accident, and apply this patch.
>>
>> - A small cleanup to exit early.
>>
>> Please note that patches 1 and 2 have already been applied in our tree.
>> So this version only has patches 3 and 4. Patch 3 is a fix, maybe not
>> patch 4, but it might be easier to squash patches 3 and 4 and send the
>> result as a fix to -net.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> Changes in v2:
>> - Previous patches 1 & 2 have been applied.
>> - Patch 1 (ex 3) has not been modified.
>> - Patch 2 has been added; it can be squashed into the previous one.
>> - Link to v1: https://lore.kernel.org/r/20250114-mpc-no-blackhole-v1-0-994bd2a357fb@kernel.org
>>
>
> Hi Matthieu -
>
> Both patches in this series look good to me:
>
> Reviewed-by: Mat Martineau <martineau@kernel.org>

Thank you for the review!

> I'd suggest either queuing patch 2 for -next, or dropping it. The
> optimizer probably makes it a no-op in terms of performance anyway, and,
> now looking at the size of the patch (even though it's mostly due to
> re-indenting and moving a couple of hunks of code around), it does seem
> like a moderate amount of churn.

I just applied the two patches. Even if the optimizer might already do a
good job, the new 'likely()' and the probably more logical order in which
the conditions are now checked might at least help code readability (and
avoid having to rely on the optimizer).
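For readers unfamiliar with the pattern being discussed, the sketch below
shows the general "check the state once, exit early, annotate the common
case with likely()" shape that patch 2 ("avoid checking the state twice")
refers to. It is illustrative only, not the actual net/mptcp/ctrl.c change:
all names are made up, and likely() is re-defined locally as the usual
__builtin_expect() wrapper so the example builds in userspace.

        #include <stdbool.h>
        #include <stdio.h>

        /* Userspace stand-in for the kernel macro of the same name. */
        #define likely(x)  __builtin_expect(!!(x), 1)

        enum conn_state { STATE_SYN_SENT, STATE_ESTABLISHED, STATE_CLOSED };

        struct conn_sim {
                enum conn_state state;
                bool request_mptcp;
                int syn_retransmits;
        };

        static void detect_blackhole(struct conn_sim *c)
        {
                /* Check the state once and exit early: in the common case
                 * this is not an MPTCP SYN still waiting for an answer, so
                 * there is nothing to do and nothing below needs to re-check
                 * the state.
                 */
                if (likely(c->state != STATE_SYN_SENT || !c->request_mptcp))
                        return;

                if (c->syn_retransmits >= 1)
                        printf("MPTCP SYN unanswered: fall back to plain TCP\n");
        }

        int main(void)
        {
                struct conn_sim c = {
                        .state = STATE_SYN_SENT,
                        .request_mptcp = true,
                        .syn_retransmits = 1,
                };

                detect_blackhole(&c);
                return 0;
        }

The point is simply that the uncommon states are filtered out once at the
top of the function, so the rest of the code no longer needs to repeat the
same test.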
New patches for t/upstream-net and t/upstream:
- fa00cb6c4848: mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
- Results: 5be385713aa4..f6c96ef8878b (export-net)
- Results: 3b679399ffaf..f912778a1fd3 (export)

New patches for t/upstream:
- bf133309e373: mptcp: blackhole: avoid checking the state twice
- Results: f912778a1fd3..e80a31769cf4 (export)

Tests are now in progress:
- export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/99817b2f2308bb041aab3ef0137999e203dacc0a/checks
- export: https://github.com/multipath-tcp/mptcp_net-next/commit/d591e451861a0d3ae05f9a9501ba687cf786f183/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12908221707

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/922f9e419c68
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=927518

If there are some issues, you can reproduce them using the same
environment as the one used by the CI thanks to a docker image, e.g.:

  $ cd [kernel source code]
  $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
      --pull always mptcp/mptcp-upstream-virtme-docker:latest \
      auto-normal

For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts already made to have a stable
test suite when it is executed on a public CI like here, it is possible
that some reported issues are not due to your modifications. Still, do
not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Critical: Global Timeout ❌
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12908221707

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/922f9e419c68
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=927518

If there are some issues, you can reproduce them using the same
environment as the one used by the CI thanks to a docker image, e.g.:

  $ cd [kernel source code]
  $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
      --pull always mptcp/mptcp-upstream-virtme-docker:latest \
      auto-normal

For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts already made to have a stable
test suite when it is executed on a public CI like here, it is possible
that some reported issues are not due to your modifications. Still, do
not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)