From: Christoph Paasch <cpaasch@openai.com>
When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(),
we used to return true, letting the segment proceed through the TCP
receive path without a DSS mapping. Such segments can leave inconsistent
mapping state and trigger a mid-stream fallback to TCP, which in testing
collapsed (by artificially forcing failures in skb_ext_add) throughput
to zero.
Return false instead so the TCP input path drops the skb (see
tcp_data_queue() and step-7 processing). This is the safer choice
under memory pressure: it preserves MPTCP correctness and provides
backpressure to the sender.
Control packets remain unaffected: ACK updates and DATA_FIN handling
happen before attempting the extension allocation, and tcp_reset()
continues to ignore the return value.
With this change, MPTCP continues to work at high throughput if we
artificially inject failures into skb_ext_add.
Fixes: 6787b7e350d3 ("mptcp: avoid processing packet if a subflow reset")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
---
net/mptcp/options.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 3c08beffbfea13597f14df17b1636b154e471398..d47b8a9bc2df2f14645b1b3d3e10fea1b38567b1 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1119,7 +1119,9 @@ static bool add_addr_hmac_valid(struct mptcp_sock *msk,
return hmac == mp_opt->ahmac;
}
-/* Return false if a subflow has been reset, else return true */
+/* Return false in case of error (or subflow has been reset),
+ * else return true.
+ */
bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
@@ -1223,7 +1225,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
if (!mpext)
- return true;
+ return false;
memset(mpext, 0, sizeof(*mpext));
---
base-commit: dc818e7d153ed5ff81b9fcec7baa3b4b78725f89
change-id: 20250805-mptcp_fixes-dd7e04528f32
Best regards,
--
Christoph Paasch <cpaasch@openai.com>
Hi Christoph, On 07/08/2025 20:00, Christoph Paasch via B4 Relay wrote: > From: Christoph Paasch <cpaasch@openai.com> > > When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(), > we used to return true, letting the segment proceed through the TCP > receive path without a DSS mapping. Such segments can leave inconsistent > mapping state and trigger a mid-stream fallback to TCP, which in testing > collapsed (by artificially forcing failures in skb_ext_add) throughput > to zero. > > Return false instead so the TCP input path drops the skb (see > tcp_data_queue() and step-7 processing). This is the safer choice > under memory pressure: it preserves MPTCP correctness and provides > backpressure to the sender. > > Control packets remain unaffected: ACK updates and DATA_FIN handling > happen before attempting the extension allocation, and tcp_reset() > continues to ignore the return value. > > With this change, MPTCP continues to work at high throughput if we > artificially inject failures into skb_ext_add. Good catch! Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> BTW, how did you detect this error? Just in case it would be interesting to have our CI running additional automated tests if that's possible and how you got the error. > Fixes: 6787b7e350d3 ("mptcp: avoid processing packet if a subflow reset") I guess we could even point to 648ef4b88673 ("mptcp: Implement MPTCP receive path"), but for the backports, that will be the same, so fine like that. Cheers, Matt -- Sponsored by the NGI0 Core fund.
On 14/08/2025 12:36, Matthieu Baerts wrote: > Hi Christoph, > > On 07/08/2025 20:00, Christoph Paasch via B4 Relay wrote: >> From: Christoph Paasch <cpaasch@openai.com> >> >> When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(), >> we used to return true, letting the segment proceed through the TCP >> receive path without a DSS mapping. Such segments can leave inconsistent >> mapping state and trigger a mid-stream fallback to TCP, which in testing >> collapsed (by artificially forcing failures in skb_ext_add) throughput >> to zero. >> >> Return false instead so the TCP input path drops the skb (see >> tcp_data_queue() and step-7 processing). This is the safer choice >> under memory pressure: it preserves MPTCP correctness and provides >> backpressure to the sender. >> >> Control packets remain unaffected: ACK updates and DATA_FIN handling >> happen before attempting the extension allocation, and tcp_reset() >> continues to ignore the return value. >> >> With this change, MPTCP continues to work at high throughput if we >> artificially inject failures into skb_ext_add. > > Good catch! > > Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > > BTW, how did you detect this error? Just in case it would be interesting > to have our CI running additional automated tests if that's possible and > how you got the error. (we can discuss ↑ later on, this shouldn't block the application of the patches :) ) Applied in our tree (fixes for -net): New patches for t/upstream-net and t/upstream: - 9c1c7316b15d: mptcp: drop skb if MPTCP skb extension allocation fails - Results: 3240257a2f06..d412bf5ae17a (export-net) - Results: c0563ee9fbbf..64863e4ad387 (export) Tests are now in progress: - export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/16f23c2d3a821edca65217999c16db17c0ea1fce/checks - export: https://github.com/multipath-tcp/mptcp_net-next/commit/a5222ce1146fbd007b59cfb0d36974816c7d1b86/checks Cheers, Matt -- Sponsored by the NGI0 Core fund.
Hi Christoph, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Unstable: 1 failed test(s): selftest_mptcp_connect 🔴 - KVM Validation: debug: Success! ✅ - KVM Validation: btf-normal (only bpftest_all): Success! ✅ - KVM Validation: btf-debug (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/16812517311 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/afb4495bb7ee Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=989206 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
© 2016 - 2025 Red Hat, Inc.