[v7] mptcp: autotune related improvement

[PATCH v7 mptcp-next 2/6] mptcp: do not account for OoO in mptcp_rcvbuf_grow()

Posted by Paolo Abeni 1 week ago

MPTCP-level out-of-order (OoO) packets are expected when multiple subflows
are active concurrently: they are not caused by drops and do not trigger
retransmissions.

Accounting for them in mptcp_rcvbuf_grow() causes the rcvbuf to slowly
drift towards tcp_rmem[2].

Remove such accounting. Note that subflows will still account for TCP-level
OoO when the MPTCP-level rcvbuf is propagated.

This also closes a subtle and very unlikely race condition with rcvspace
init: on active sockets, while user-space holds the msk-level socket lock,
such initialization could complete in the receive callback only after the
first OoO data has reached the rcvbuf, potentially triggering a divide by
zero Oops.

Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 6 ------
 1 file changed, 6 deletions(-)
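
For readers following the arithmetic, here is a minimal user-space sketch of
the growth computation touched by this patch. It is not the kernel code:
rcvbuf_target(), space_from_win() and the ooo_span parameter are hypothetical
names, the 2x window-to-space factor is an assumption standing in for
mptcp_space_from_win(), and the first two steps (doubling newval, scaling by
the relative increase) are assumed to mirror the TCP counterpart since the
hunk only shows the do_div() and the lines after it. Inputs are assumed to be
well below 2^31.

```c
#include <stdint.h>

/* Hypothetical stand-in for mptcp_space_from_win(): assume the buffer
 * space needed is twice the window. The real scaling differs. */
static uint64_t space_from_win(uint64_t win)
{
	return win * 2;
}

/*
 * Sketch of the rcvbuf target computation.
 * oldval:   previous receive-space estimate (rcvq_space.space)
 * newval:   new receive-space estimate
 * ooo_span: bytes between ack_seq and the end of the last OoO skb;
 *           pass 0 to model the behaviour after this patch
 * cap:      upper bound, i.e. tcp_rmem[2]
 */
static uint32_t rcvbuf_target(uint32_t oldval, uint32_t newval,
			      uint32_t ooo_span, uint32_t cap)
{
	uint64_t rcvwin, grow, space;

	/* Guard added for the sketch only: oldval == 0 is the divide by
	 * zero case from the commit message; newval < oldval would wrap. */
	if (oldval == 0 || newval < oldval)
		return 0;

	rcvwin = (uint64_t)newval << 1;			/* assumed first step */
	grow = rcvwin * (newval - oldval) / oldval;	/* do_div(grow, oldval) */
	rcvwin += grow << 1;

	rcvwin += ooo_span;	/* the accounting removed by this patch */

	space = space_from_win(rcvwin);
	return space < cap ? (uint32_t)space : cap;
}
```

With oldval == newval == 1000, as in the removed call from
mptcp_data_queue_ofo(), the sketch yields 4000 with no OoO span and 5000 with
a span of 500. The extra term is never scaled back, which is consistent with
the drift towards the cap described above.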

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 64f592a7897c..e31ccc4bbb2d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -226,9 +226,6 @@ static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
 	do_div(grow, oldval);
 	rcvwin += grow << 1;
 
-	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
-		rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
-
 	cap = READ_ONCE(net->ipv4.sysctl_tcp_rmem[2]);
 
 	rcvbuf = min_t(u32, mptcp_space_from_win(sk, rcvwin), cap);
@@ -352,9 +349,6 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
 end:
 	skb_condense(skb);
 	skb_set_owner_r(skb, sk);
-	/* do not grow rcvbuf for not-yet-accepted or orphaned sockets. */
-	if (sk->sk_socket)
-		mptcp_rcvbuf_grow(sk, msk->rcvq_space.space);
 }
 
 static void mptcp_init_skb(struct sock *ssk, struct sk_buff *skb, int offset,
-- 
2.51.1

Re: [PATCH v7 mptcp-next 2/6] mptcp: do not account for OoO in mptcp_rcvbuf_grow()

Posted by Mat Martineau 12 hours ago

On Thu, 20 Nov 2025, Paolo Abeni wrote:

> MPTCP-level out-of-order (OoO) packets are expected when multiple subflows
> are active concurrently: they are not caused by drops and do not trigger
> retransmissions.
>
> Accounting for them in mptcp_rcvbuf_grow() causes the rcvbuf to slowly
> drift towards tcp_rmem[2].
>
> Remove such accounting. Note that subflows will still account for TCP-level
> OoO when the MPTCP-level rcvbuf is propagated.
>
> This also closes a subtle and very unlikely race condition with rcvspace
> init: on active sockets, while user-space holds the msk-level socket lock,
> such initialization could complete in the receive callback only after the
> first OoO data has reached the rcvbuf, potentially triggering a divide by
> zero Oops.
>
> Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/mptcp/protocol.c | 6 ------
> 1 file changed, 6 deletions(-)

Looks good, thanks Paolo:

Reviewed-by: Mat Martineau <martineau@kernel.org>


>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 64f592a7897c..e31ccc4bbb2d 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -226,9 +226,6 @@ static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
> 	do_div(grow, oldval);
> 	rcvwin += grow << 1;
>
> -	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
> -		rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
> -
> 	cap = READ_ONCE(net->ipv4.sysctl_tcp_rmem[2]);
>
> 	rcvbuf = min_t(u32, mptcp_space_from_win(sk, rcvwin), cap);
> @@ -352,9 +349,6 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
> end:
> 	skb_condense(skb);
> 	skb_set_owner_r(skb, sk);
> -	/* do not grow rcvbuf for not-yet-accepted or orphaned sockets. */
> -	if (sk->sk_socket)
> -		mptcp_rcvbuf_grow(sk, msk->rcvq_space.space);
> }
>
> static void mptcp_init_skb(struct sock *ssk, struct sk_buff *skb, int offset,
> -- 
> 2.51.1
>
>

[PATCH v7 mptcp-next 1/6] trace: mptcp: add mptcp_rcvbuf_grow tracepoint
[PATCH v7 mptcp-next 2/6] mptcp: do not account for OoO in mptcp_rcvbuf_grow()
[PATCH v7 mptcp-next 3/6] mptcp: fix receive space timestamp initialization.
[PATCH v7 mptcp-next 4/6] mptcp: consolidate rcv space init
[PATCH v7 mptcp-next 5/6] mptcp: better mptcp-level RTT estimator
[PATCH v7 mptcp-next 6/6] mptcp: add receive queue awareness in tcp_rcv_space_adjust()