[RESEND PATCH mptcp-net] mptcp: sync the msk->sndbuf at accept() time

Gang Yan posted 1 patch 2 weeks, 2 days ago
Patches applied successfully
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20260306062316.1333680-1-gang.yan@linux.dev
Posted by Gang Yan 2 weeks, 2 days ago
From: Gang Yan <yangang@kylinos.cn>

After an MPTCP connection is established, the sk_sndbuf of the client's
msk can be updated through 'subflow_finish_connect'. However, the newly
accepted msk on the server side has a smaller sk_sndbuf than
msk->first->sk_sndbuf:

'''
MPTCP: msk:00000000e55b09db, msk->sndbuf:20480, msk->first->sndbuf:2626560
'''

This means that when the server sends data with MSG_DONTWAIT immediately
after the connection is established, it is more likely to hit EAGAIN.

This patch synchronizes the sk_sndbuf by triggering its update during accept.

Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/602
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---

Notes:
    Hi Paolo, Matt,
    
    Sorry for the late reply on this patch. I've been analyzing this
    issue, and the basic picture is as follows:
    
    The root cause is a timing gap between msk creation and TCP sndbuf
    auto-tuning on the server side:
    
    1. When the server receives the SYN, mptcp_sk_clone_init() creates the
       msk and calls __mptcp_propagate_sndbuf(). At this point, the TCP
       subflow is still in SYN_RCVD state, so its sk_sndbuf has only the
       initial value (tcp_wmem[1], typically ~16KB).
    
    2. When the 3-way handshake completes (ACK received), the TCP stack
       calls tcp_init_buffer_space() -> tcp_sndbuf_expand(), which grows
       the subflow's sk_sndbuf based on MSS, congestion window, etc.
       (potentially up to tcp_wmem[2], ~4MB).
    
    3. However, this auto-tuning happens deep in the TCP stack without
       any callback to MPTCP, so msk->sk_sndbuf is never updated to
       reflect the new subflow sndbuf value.
    
    4. When accept() returns, msk->sk_sndbuf still holds the small initial
       value, while msk->first->sk_sndbuf has been auto-tuned to a much
       larger value.
    
    In contrast, the active (client) side doesn't have this issue because
    subflow_finish_connect() calls mptcp_propagate_state() after the TCP
    sndbuf auto-tuning has already occurred, ensuring proper synchronization.
    
    Thanks
    Gang

 net/mptcp/protocol.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b5676b37f8f4..17e43aff4459 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -4232,6 +4232,7 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 
 		mptcp_graft_subflows(newsk);
 		mptcp_rps_record_subflows(msk);
+		__mptcp_propagate_sndbuf(newsk, mptcp_subflow_tcp_sock(subflow));
 
 		/* Do late cleanup for the first subflow as necessary. Also
 		 * deal with bad peers not doing a complete shutdown.
-- 
2.43.0
Re: [RESEND PATCH mptcp-net] mptcp: sync the msk->sndbuf at accept() time
Posted by MPTCP CI 2 weeks, 1 day ago
Hi Gang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Notice: Boot failures, rebooted and continued 🔴
- KVM Validation: normal (only selftest_mptcp_join): Notice: Call Traces at boot time, rebooted and continued 🔴
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/22752507226

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/56845d31abb3
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1062356


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do
not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)