[PATCH mptcp-net] mptcp: fix ack generation for fallback msk

Paolo Abeni posted 1 patch 2 weeks, 6 days ago
Failed in applying to current master (apply log)
net/mptcp/options.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
[PATCH mptcp-net] mptcp: fix ack generation for fallback msk
Posted by Paolo Abeni 2 weeks, 6 days ago
mptcp_cleanup_rbuf() need to know the last most recent, mptcp-level
rcv_wnd sent, and such information is tracked into the msk->old_wspace
field, updated at ack transmission time by mptcp_write_options().

Fallback socket do not add any mptcp options, such helper is never
invoked, and msk->old_wspace value remain stale. That in turn makes
ack generation at recvmsg() time quite randomic.

Address the issue ensuring mptcp_write_options() is invoked even
for fallback sockets, and just update the needed info in such a
case.

The issue went unnoticed for a long time, as mptcp overshot the
fallback socket receive buffer autotune significantly.

Fixes: e3859603ba13 ("mptcp: better msk receive window updates")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ad51dcf18984..8d3b0f9754dc 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -838,8 +838,11 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
 
 	opts->suboptions = 0;
 
+	/* Force later mptcp_write_options(), but do not use any actual
+	 * option space.
+	 */
 	if (unlikely(__mptcp_check_fallback(msk) && !mptcp_check_infinite_map(skb)))
-		return false;
+		return true;
 
 	if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) {
 		if (mptcp_established_options_fastclose(sk, &opt_size, remaining, opts) ||
@@ -1351,6 +1354,20 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 	subflow->rcv_wnd_sent = rcv_wnd_new;
 }
 
+static void mptcp_track_rwin(struct tcp_sock *tp)
+{
+	const struct sock *ssk = (const struct sock *)tp;
+	struct mptcp_subflow_context *subflow;
+	struct mptcp_sock *msk;
+
+	if (!ssk)
+		return;
+
+	subflow = mptcp_subflow_ctx(ssk);
+	msk = mptcp_sk(subflow->conn);
+	WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
+}
+
 __sum16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum)
 {
 	struct csum_pseudo_header header;
@@ -1643,6 +1660,9 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
 				      opts->reset_transient,
 				      opts->reset_reason);
 		return;
+	} else if (unlikely(!opts->suboptions)) {
+		mptcp_track_rwin(tp);
+		return;
 	}
 
 	if (OPTION_MPTCP_PRIO & opts->suboptions) {
-- 
2.51.0
Re: [PATCH mptcp-net] mptcp: fix ack generation for fallback msk
Posted by Matthieu Baerts 2 weeks, 1 day ago
Hi Paolo,

On 07/11/2025 01:10, Paolo Abeni wrote:
> mptcp_cleanup_rbuf() need to know the last most recent, mptcp-level
> rcv_wnd sent, and such information is tracked into the msk->old_wspace
> field, updated at ack transmission time by mptcp_write_options().
> 
> Fallback socket do not add any mptcp options, such helper is never
> invoked, and msk->old_wspace value remain stale. That in turn makes
> ack generation at recvmsg() time quite randomic.
> 
> Address the issue ensuring mptcp_write_options() is invoked even
> for fallback sockets, and just update the needed info in such a
> case.
> 
> The issue went unnoticed for a long time, as mptcp overshot the> fallback socket receive buffer autotune significantly.

Should we say "used to overshot"? :)

> Fixes: e3859603ba13 ("mptcp: better msk receive window updates")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594

Thank you for the fix!

(...)

> @@ -1643,6 +1660,9 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
>  				      opts->reset_transient,
>  				      opts->reset_reason);
>  		return;
> +	} else if (unlikely(!opts->suboptions)) {

Small detail: do you mind if I add /* Fallback to TCP */ just above when
applying the patch?

> +		mptcp_track_rwin(tp);

Should we eventually also call mptcp_track_rwin(tp) in case of MP_FAIL
without MP_RST? Or is this not needed?

Other than that, it looks good to me, and I guess we can already apply
the patch:

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

> +		return;
>  	}
>  
>  	if (OPTION_MPTCP_PRIO & opts->suboptions) {

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-net] mptcp: fix ack generation for fallback msk
Posted by Matthieu Baerts 2 weeks, 1 day ago
Hi Paolo,

On 11/11/2025 16:30, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 07/11/2025 01:10, Paolo Abeni wrote:
>> mptcp_cleanup_rbuf() need to know the last most recent, mptcp-level
>> rcv_wnd sent, and such information is tracked into the msk->old_wspace
>> field, updated at ack transmission time by mptcp_write_options().
>>
>> Fallback socket do not add any mptcp options, such helper is never
>> invoked, and msk->old_wspace value remain stale. That in turn makes
>> ack generation at recvmsg() time quite randomic.
>>
>> Address the issue ensuring mptcp_write_options() is invoked even
>> for fallback sockets, and just update the needed info in such a
>> case.
>>
>> The issue went unnoticed for a long time, as mptcp overshot the> fallback socket receive buffer autotune significantly.
> 
> Should we say "used to overshot"? :)
> 
>> Fixes: e3859603ba13 ("mptcp: better msk receive window updates")
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594
> 
> Thank you for the fix!
> 
> (...)
> 
>> @@ -1643,6 +1660,9 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
>>  				      opts->reset_transient,
>>  				      opts->reset_reason);
>>  		return;
>> +	} else if (unlikely(!opts->suboptions)) {
> 
> Small detail: do you mind if I add /* Fallback to TCP */ just above when
> applying the patch?
> 
>> +		mptcp_track_rwin(tp);
> 
> Should we eventually also call mptcp_track_rwin(tp) in case of MP_FAIL
> without MP_RST? Or is this not needed?
> 
> Other than that, it looks good to me, and I guess we can already apply
> the patch:

That's what I just did: I guess the modification for MP_FAIL, if really
needed, can still be added later on.

New patches for t/upstream-net and t/upstream:
- c3a571ff59cc: mptcp: fix ack generation for fallback msk
- Results: 751a2432b7d8..5b6b4e3aef4f (export-net)
- Results: 247540d77d20..a96affe5f0a1 (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/7fade8917eeabe227b5ffae540e9cbe48d3e5046/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/b10c1d3d1e5da09bbc2b81eca8f4339b84499268/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-net] mptcp: fix ack generation for fallback msk
Posted by Geliang Tang 2 weeks, 2 days ago
Hi Paolo,

Thanks for this fix.

On Fri, 2025-11-07 at 01:10 +0100, Paolo Abeni wrote:
> mptcp_cleanup_rbuf() need to know the last most recent, mptcp-level
> rcv_wnd sent, and such information is tracked into the msk-
> >old_wspace
> field, updated at ack transmission time by mptcp_write_options().
> 
> Fallback socket do not add any mptcp options, such helper is never
> invoked, and msk->old_wspace value remain stale. That in turn makes
> ack generation at recvmsg() time quite randomic.
> 
> Address the issue ensuring mptcp_write_options() is invoked even
> for fallback sockets, and just update the needed info in such a
> case.
> 
> The issue went unnoticed for a long time, as mptcp overshot the
> fallback socket receive buffer autotune significantly.
> 
> Fixes: e3859603ba13 ("mptcp: better msk receive window updates")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

This patch looks good to me!

    Reviewed-by: Geliang Tang <geliang@kernel.org>

-Geliang

> ---
>  net/mptcp/options.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index ad51dcf18984..8d3b0f9754dc 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -838,8 +838,11 @@ bool mptcp_established_options(struct sock *sk,
> struct sk_buff *skb,
>  
>  	opts->suboptions = 0;
>  
> +	/* Force later mptcp_write_options(), but do not use any
> actual
> +	 * option space.
> +	 */
>  	if (unlikely(__mptcp_check_fallback(msk) &&
> !mptcp_check_infinite_map(skb)))
> -		return false;
> +		return true;
>  
>  	if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags &
> TCPHDR_RST)) {
>  		if (mptcp_established_options_fastclose(sk,
> &opt_size, remaining, opts) ||
> @@ -1351,6 +1354,20 @@ static void mptcp_set_rwin(struct tcp_sock
> *tp, struct tcphdr *th)
>  	subflow->rcv_wnd_sent = rcv_wnd_new;
>  }
>  
> +static void mptcp_track_rwin(struct tcp_sock *tp)
> +{
> +	const struct sock *ssk = (const struct sock *)tp;
> +	struct mptcp_subflow_context *subflow;
> +	struct mptcp_sock *msk;
> +
> +	if (!ssk)
> +		return;
> +
> +	subflow = mptcp_subflow_ctx(ssk);
> +	msk = mptcp_sk(subflow->conn);
> +	WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
> +}
> +
>  __sum16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16
> data_len, __wsum sum)
>  {
>  	struct csum_pseudo_header header;
> @@ -1643,6 +1660,9 @@ void mptcp_write_options(struct tcphdr *th,
> __be32 *ptr, struct tcp_sock *tp,
>  				      opts->reset_transient,
>  				      opts->reset_reason);
>  		return;
> +	} else if (unlikely(!opts->suboptions)) {
> +		mptcp_track_rwin(tp);
> +		return;
>  	}
>  
>  	if (OPTION_MPTCP_PRIO & opts->suboptions) {

Re: [PATCH mptcp-net] mptcp: fix ack generation for fallback msk
Posted by MPTCP CI 2 weeks, 6 days ago
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_dss 🔴
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19161858402

Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/032d2f7924de
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1020688


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)