If a subflow receives data before gaining the memcg while the msk
socket lock is held at accept time, or if the PM locks the msk socket
while it is still unaccepted and subflows push data to it at the same
time, mptcp_graph_subflows() can complete with a non-empty backlog.

The msk will try to borrow such memory, but some of the skbs there
were not memcg charged. When the msk eventually returns such accounted
memory, we should hit the same splat as in #597.

[even if, so far, I have been unable to replicate this scenario]

This patch tries to address this potential issue by:

- preventing the subflows from queueing data into the backlog after
  gaining the memcg. This ensures that, at the end of the loop, all the
  skbs in the backlog (if any) are _not_ memory accounted.
- charging the backlog memory to the msk (roughly as sketched below).
- 'restarting' the subflows and spooling any data waiting there.
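The charging step tops up the memcg charge with the page delta due to
the backlog; roughly, mirroring the hunk below (__sk_charge at accept
time covered fwd memory and rmem only):

	/* Pages already charged at accept time: fwd alloc + rmem. */
	old_amt = sk_mem_pages(sk->sk_forward_alloc +
			       atomic_read(&sk->sk_rmem_alloc));
	/* Pages needed once the backlog is included as well. */
	amt = sk_mem_pages(msk->backlog_len + sk->sk_forward_alloc +
			   atomic_read(&sk->sk_rmem_alloc));
	/* Charge only the difference; must not fail under the data lock. */
	if (amt - old_amt)
		mem_cgroup_sk_charge(sk, amt - old_amt,
				     GFP_ATOMIC | __GFP_NOFAIL);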
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/mptcp/protocol.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 42 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 5e9325c7ea9c..7e58d35f6b13 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -4082,10 +4082,12 @@ static void mptcp_graph_subflows(struct sock *sk)
{
struct mptcp_subflow_context *subflow;
struct mptcp_sock *msk = mptcp_sk(sk);
+ struct sock *ssk;
+ int old_amt, amt;
+ bool slow;
mptcp_for_each_subflow(msk, subflow) {
- struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
- bool slow;
+ ssk = mptcp_subflow_tcp_sock(subflow);
slow = lock_sock_fast(ssk);
@@ -4095,8 +4097,46 @@ static void mptcp_graph_subflows(struct sock *sk)
if (!ssk->sk_socket)
mptcp_sock_graft(ssk, sk->sk_socket);
+ if (!mem_cgroup_from_sk(sk))
+ continue;
+
__mptcp_inherit_cgrp_data(sk, ssk);
__mptcp_inherit_memcg(sk, ssk, GFP_KERNEL);
+
+ /* Prevent subflows from queueing data into the backlog
+ * as soon as cg is set; note that we can't race
+ * with __mptcp_close_ssk setting this bit for a really
+ * closing socket, because we hold the msk socket lock here.
+ */
+ subflow->closing = 1;
+ unlock_sock_fast(ssk, slow);
+ }
+
+ if (!mem_cgroup_from_sk(sk))
+ return;
+
+ /* Charge the bl memory, note that __sk_charge accounted for
+ * fwd memory and rmem only
+ */
+ mptcp_data_lock(sk);
+ old_amt = sk_mem_pages(sk->sk_forward_alloc +
+ atomic_read(&sk->sk_rmem_alloc));
+ amt = sk_mem_pages(msk->backlog_len + sk->sk_forward_alloc +
+ atomic_read(&sk->sk_rmem_alloc));
+ amt -= old_amt;
+ if (amt)
+ mem_cgroup_sk_charge(sk, amt, GFP_ATOMIC | __GFP_NOFAIL);
+ mptcp_data_unlock(sk);
+
+ /* Finally let the subflow restart queuing data. */
+ mptcp_for_each_subflow(msk, subflow) {
+ ssk = mptcp_subflow_tcp_sock(subflow);
+
+ slow = lock_sock_fast(ssk);
+ subflow->closing = 0;
+
+ if (mptcp_subflow_data_available(ssk))
+ mptcp_data_ready(sk, ssk);
unlock_sock_fast(ssk, slow);
}
}
--
2.51.0
On Fri, Nov 7, 2025 at 11:31 PM Paolo Abeni <pabeni@redhat.com> wrote:
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 5e9325c7ea9c..7e58d35f6b13 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -4082,10 +4082,12 @@ static void mptcp_graph_subflows(struct sock *sk)
> {
> struct mptcp_subflow_context *subflow;
> struct mptcp_sock *msk = mptcp_sk(sk);
> + struct sock *ssk;
> + int old_amt, amt;
> + bool slow;
>
> mptcp_for_each_subflow(msk, subflow) {
> - struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> - bool slow;
> + ssk = mptcp_subflow_tcp_sock(subflow);
>
> slow = lock_sock_fast(ssk);
>
> @@ -4095,8 +4097,46 @@ static void mptcp_graph_subflows(struct sock *sk)
> if (!ssk->sk_socket)
> mptcp_sock_graft(ssk, sk->sk_socket);
>
> + if (!mem_cgroup_from_sk(sk))
> + continue;
Oops, the above last-minute optimization is blatantly wrong: we must
release the lock before continuing.
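Untested, but the fix should be along these lines, i.e. skip only the
memcg work so that the unlock is always reached:

	mptcp_for_each_subflow(msk, subflow) {
		ssk = mptcp_subflow_tcp_sock(subflow);

		slow = lock_sock_fast(ssk);

		if (!ssk->sk_socket)
			mptcp_sock_graft(ssk, sk->sk_socket);

		if (mem_cgroup_from_sk(sk)) {
			__mptcp_inherit_cgrp_data(sk, ssk);
			__mptcp_inherit_memcg(sk, ssk, GFP_KERNEL);

			/* As per the comment in the patch: stop this
			 * subflow from queueing into the backlog until
			 * the msk has charged the pending memory.
			 */
			subflow->closing = 1;
		}
		unlock_sock_fast(ssk, slow);
	}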
/P
Hi Paolo,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Critical: 5 Call Trace(s) - Critical: Global Timeout ❌
- KVM Validation: normal (only selftest_mptcp_join): Critical: 5 Call Trace(s) - Critical: Unexpected stop of the VM ❌
- KVM Validation: debug (except selftest_mptcp_join): Critical: 5 Call Trace(s) - Critical: Global Timeout ❌
- KVM Validation: debug (only selftest_mptcp_join): Critical: 6 Call Trace(s) - Critical: Unexpected stop of the VM ❌
- KVM Validation: btf-normal (only bpftest_all): Critical: 4 Call Trace(s) - Critical: Global Timeout ❌
- KVM Validation: btf-debug (only bpftest_all): Critical: 6 Call Trace(s) - Critical: Unexpected stop of the VM ❌
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19194275866
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/83be86fdeda7
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1021097
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like here, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Paolo,

Thank you for your modifications, that's great! But sadly, our CI spotted
some issues with it when trying to build it.

You can find more details there:
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19194275874

Status: failure
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/83be86fdeda7
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1021097

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)