From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5CC3517A2EC;
	Tue, 18 Feb 2025 18:36:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903792; cv=none;
 b=PG9S90cExD7WYHtJBb9DgTuioEX+PwcOQMmFbyMfCCcs4LjZhOuMCWKqqeGXwYASO2IMVHzw6CLQzOLMKs2E1T+Iuf2UELuZG+bdbsaZdopBk7JgIoqzRTpwJG4Nty8+zm5jjRrHFVsJ3PggOW6ypODc7Tv8zzVSpqWcDvFbkng=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903792; c=relaxed/simple;
	bh=ACPJ+yPY2W1joc1pWuexOVqHoRklIHDbYoAvIJBUces=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=IivbUl05UoUliazvhm92tpDHtLxLU2kplD+Wef3fkA+yIlNMgjjUKOJsw5JOMmtVY9sjdZ1wFm+9qdfXRndi4vJuF4DUv1txNd0EHayvKGfLz2u+g7qsuv0UTlCuLj/av0SP7JNEgSBpfYnXlVTN+u+MbY+skTHPyajzO3SeWSg=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=vCVhDslj; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="vCVhDslj"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id AA63DC4CEEC;
	Tue, 18 Feb 2025 18:36:27 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903790;
	bh=ACPJ+yPY2W1joc1pWuexOVqHoRklIHDbYoAvIJBUces=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=vCVhDsljVOLC+ZuCrbHzsCMseOuvPK6IM8frJ6YPpjw/ofh8Gyme3GvfSIbY62w+L
	 Q/DMBlraTT0WxRBy6V1d3uJBAhEV3r8HnfFIjcKRbchcmIPyrnog7moRoxE2py5eEp
	 KXNgDfHOrmWbw4Bdrw5Ow97ckMX5RVdoiwxbDT7mrlJ64CkbjSC/ywj3q8Fzd/odaw
	 UR0qrwzX07WWlmyW12wElxeQLEKM3b5T3q1VZ1Hs56RLpJEYv60wKD2abTgkhdfDai
	 D6v5x9rbpHhTLzJ8QuhafgiQCZlziqt/UD2fkVmKXovTaIPqnnE377U5nAuf12CREt
	 Ym9hDOqMDXB6A==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:12 +0100
Subject: [PATCH net-next 1/7] mptcp: consolidate subflow cleanup
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-1-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=3128; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=zRyPHzOQB2XFEMTBCqtrVF98bVbR+f2SqeWZUx2E3Mk=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMnQ625mFfmmXvXNxxfMGbI3LG3oiO7dwtQo
 VC68j5mG52JAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 cwC8D/43bwVvWQSnpOYMdB0IUUji3qmRDXfax0ap6JB5FwA6aPGzzySU8tynCEX2ZswNINX6ytJ
 vlAceGeapAN+SA6Of2kvo8vmlcBinvb1Unb50dTwjgx5fUY4/6rwYRAEXbanR2jOw87T7MHzIGq
 ozFcB/KKveDT3wM8Q1/tktDBf6SW7eyPvDsAciZpSe2keAN42A+s9rwi2wM5jiqn5BcrxGl85bw
 X9n2wKvoirZ0pYQWsa8f3O3VZeksUsPiFZK13dTtnFWbmCYrVtplUnPmFSJZQFf4sNHdtkZvP5t
 jQkf0D9PfKQ+DrofrCJpNQRDRtyYMc8SivCafViYBKSFzYAhR5UkJJWbCQbUCPEQw/fYZhtYAc7
 hhfAOJi6VfLrYg3khVuFZg29mrCJrHW0hzDKE5FjMxaBN/RPpcnPUwwJybgbNsPdhvMXNNEcRK7
 LeHnQGdXXR8OQterDWgACO2laWpUnGuoLXoXXwBnz+kX2RL/tRBj9Es5ImYrBzroBldhtDfPJWk
 rSxHqaDIGedrP2G/lptfQbarakYVXi+/ancBR5ULW/SS5MmB995Vz21664xEdtlp9XjmpLifThc
 J2u4bAb+hgiLLyNtputga+N9trgyN6qCmmBv7hF4PkJJP2uXvFpZYqwKK03xLstWOBdraruMrVq
 otSK2CtPyRpqmfA==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

Consolidate all the cleanup actions requiring the worker into a single
helper, and ensure the dummy DATA_FIN creation for fallback sockets is
performed only when the TCP rx queue is empty.

No functional change is intended, but this will simplify the next patch,
where the TCP rx queue spooling can be delayed until release_cb time.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/subflow.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index fd021cf8286eff9234b950a4d4c083ea7756eba3..2926bdf88e42c5f2db6875b00b4=
eca2dbf49dba2 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1271,7 +1271,12 @@ static void mptcp_subflow_discard_data(struct sock *=
ssk, struct sk_buff *skb,
 		subflow->map_valid =3D 0;
 }
=20
-/* sched mptcp worker to remove the subflow if no more data is pending */
+static bool subflow_is_done(const struct sock *sk)
+{
+	return sk->sk_shutdown & RCV_SHUTDOWN || sk->sk_state =3D=3D TCP_CLOSE;
+}
+
+/* sched mptcp worker for subflow cleanup if no more data is pending */
 static void subflow_sched_work_if_closed(struct mptcp_sock *msk, struct so=
ck *ssk)
 {
 	struct sock *sk =3D (struct sock *)msk;
@@ -1281,8 +1286,18 @@ static void subflow_sched_work_if_closed(struct mptc=
p_sock *msk, struct sock *ss
 		    inet_sk_state_load(sk) !=3D TCP_ESTABLISHED)))
 		return;
=20
-	if (skb_queue_empty(&ssk->sk_receive_queue) &&
-	    !test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags))
+	if (!skb_queue_empty(&ssk->sk_receive_queue))
+		return;
+
+	if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags))
+		mptcp_schedule_work(sk);
+
+	/* when the fallback subflow closes the rx side, trigger a 'dummy'
+	 * ingress data fin, so that the msk state will follow along
+	 */
+	if (__mptcp_check_fallback(msk) && subflow_is_done(ssk) &&
+	    msk->first =3D=3D ssk &&
+	    mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true))
 		mptcp_schedule_work(sk);
 }
=20
@@ -1842,11 +1857,6 @@ static void __subflow_state_change(struct sock *sk)
 	rcu_read_unlock();
 }
=20
-static bool subflow_is_done(const struct sock *sk)
-{
-	return sk->sk_shutdown & RCV_SHUTDOWN || sk->sk_state =3D=3D TCP_CLOSE;
-}
-
 static void subflow_state_change(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(sk);
@@ -1873,13 +1883,6 @@ static void subflow_state_change(struct sock *sk)
 		subflow_error_report(sk);
=20
 	subflow_sched_work_if_closed(mptcp_sk(parent), sk);
-
-	/* when the fallback subflow closes the rx side, trigger a 'dummy'
-	 * ingress data fin, so that the msk state will follow along
-	 */
-	if (__mptcp_check_fallback(msk) && subflow_is_done(sk) && msk->first =3D=
=3D sk &&
-	    mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true))
-		mptcp_schedule_work(parent);
 }
=20
 void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *list=
ener_ssk)

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 04A0A1F5822;
	Tue, 18 Feb 2025 18:36:34 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903796; cv=none;
 b=rYrq0ACskQ/uLI35l+0i1FMIbvSWPgE2UR6oH0BA6ARDypFZFHePT6mw+6aWNEv8T909edkX3Ld54ECfhS+gJSuSrw2Qf86eu98j+yKcp5Nqw12ZyCxDBlUcntvC6KoRXeMvgrzmeNFoclICrm+O+xTnVTxq1lwbjJNUls7yPX0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903796; c=relaxed/simple;
	bh=wuahuA5ylgrgljO5VDYDx96x4R3nL275ICd3P5Wv79o=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=dtNsDniIQaOpFkZv9O88qOzW5Q0BIV3KapOTFGL+Y2sW2ZRQkPTEZQIK4TlWigbNZseNYBkpdsIxgEA3MoPDfYWUPD30GOHcFQY4o1dqcrUQwN0YsItiLj4mDdCb/CU6HRL377KPaGtFgxI0KNppzhWqpn37odex4Ab99m91oSU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=qrTWpMq+; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="qrTWpMq+"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 508F6C4CEE2;
	Tue, 18 Feb 2025 18:36:31 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903794;
	bh=wuahuA5ylgrgljO5VDYDx96x4R3nL275ICd3P5Wv79o=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=qrTWpMq+FcCb2iAeqN9Nq26JmItHNfZqGiDWr+YEEJp9IqvQ2B3LGgL68Qy47hmcK
	 QlicUzF2TgSiexpQzZQy/Tb/NK2Vd5/kXyy0YiZnZr3S8qcZkAyFi3gEeTF3eIDuRB
	 R8St2XEoqmi/7u+XqKwjbcgL8KirdWZkN7N+OBhohktze4jXeuI6CRlh49i0Wkg9Ao
	 9JyPg0z6YPWH0Zj+IvJwxxalosNdCeGX7DSMjZbpISXyWgJ7oPWzCNAgjdj9/dacUh
	 XTLeOZRWdCRNk2SX58a7QbwoDa2EZAN+ogn8S58VtGOw3LTT8DX1TIqo52Vx/JqViD
	 1ErvGwC+vg8GA==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:13 +0100
Subject: [PATCH net-next 2/7] mptcp: drop __mptcp_fastopen_gen_msk_ackseq()
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-2-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=5353; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=OPeer42AvAhJ9v5YycxG/rFJZR0B5WxAYUWtDju/5ys=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMn3S2M7io/paYz65v9IzpjAlduxP+x+sXi3
 +67ZLFaKGGJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 c1qiD/9Zaj2zIeXq2gAh0aMD6rCOjJgPQcBxj/5qVOI5uyr/JXenv+4s0lzx4Cpa0smO/res+S1
 9oyGQrMidxzvJW8jbXokJexrxGWGoSpntqSyW4zUUsPDoWbOE9xuTrUfCX6dIKIYBEbW7rxWytf
 TjsdydbgruOnU3R+GQE0u/rCG2BqVI09iXKfiDpmw10fSeNyfj7BvE1nkPCVu45q6rHmJjsNJcY
 78FDdJizs2DJmB1KbNIQ8jkJVm9fZvNeGgTnAOrZ1MeZJHa4KqlX36H1ZcyCwu1kQ71VD9bHCHD
 4lspgXMsrGTkzzQotiixK3F1858G5AeD478CGfrxBsKfpLbN4EW9NNMN6J/i9zGElLASBFxxg4Y
 WjLIMPly5Z0Hpvfr/1oJISlmALx+7U404WyQ/CJyHdurWc022zeEsbgGj72nUBP9pf0bgU3JGcL
 kEUpocLF3imR8Ca8CH0+8p5tMqkQ5S0hHnxnAg3+MzvVvWvaP5/hgHtByEhLsM2mX+50EqyqlAW
 7+brGZiwNSm4YFXz6pQR2I24iCMaQY6zV3RVaGa+IQAi49EaFDDVfTaC+NW4obA3fWCLVQVe13R
 0OnLamFeTQ/W5CuVEW2//nqoA464GTR4f5CGbCF/2N3hWY3HjWS4YzTMqhwne7p+4eWvzmGgm2C
 zsFIUBgZHuo+RKw==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

Once the whole RX path moves under the msk socket lock, updating the
already queued skb for a passive fastopen socket at 3rd ack time would
be extremely painful and race-prone.

The map_seq for already enqueued skbs is used only to allow correct
coalescing with later data. By preventing coalescing into the first skb
of a fastopen connection, we can completely remove the
__mptcp_fastopen_gen_msk_ackseq() helper.

Before dropping this helper, a new item had to be added to the
mptcp_skb_cb structure. Because this item will be frequently tested in
the fast path -- almost on every packet -- and because there is free
space there, a single byte is used instead of a bitfield. This
micro-optimisation slightly reduces the number of CPU operations needed
for the associated check.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/fastopen.c | 24 ++----------------------
 net/mptcp/protocol.c |  4 +++-
 net/mptcp/protocol.h |  5 ++---
 net/mptcp/subflow.c  |  3 ---
 4 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c
index a29ff901df7588dec24e330ddd77a4aeb1462b68..7777f5a2d14379853fcd13c4b57=
c5569be05a2e4 100644
--- a/net/mptcp/fastopen.c
+++ b/net/mptcp/fastopen.c
@@ -40,13 +40,12 @@ void mptcp_fastopen_subflow_synack_set_params(struct mp=
tcp_subflow_context *subf
 	tp->copied_seq +=3D skb->len;
 	subflow->ssn_offset +=3D skb->len;
=20
-	/* initialize a dummy sequence number, we will update it at MPC
-	 * completion, if needed
-	 */
+	/* Only the sequence delta is relevant */
 	MPTCP_SKB_CB(skb)->map_seq =3D -skb->len;
 	MPTCP_SKB_CB(skb)->end_seq =3D 0;
 	MPTCP_SKB_CB(skb)->offset =3D 0;
 	MPTCP_SKB_CB(skb)->has_rxtstamp =3D TCP_SKB_CB(skb)->has_rxtstamp;
+	MPTCP_SKB_CB(skb)->cant_coalesce =3D 1;
=20
 	mptcp_data_lock(sk);
=20
@@ -58,22 +57,3 @@ void mptcp_fastopen_subflow_synack_set_params(struct mpt=
cp_subflow_context *subf
=20
 	mptcp_data_unlock(sk);
 }
-
-void __mptcp_fastopen_gen_msk_ackseq(struct mptcp_sock *msk, struct mptcp_=
subflow_context *subflow,
-				     const struct mptcp_options_received *mp_opt)
-{
-	struct sock *sk =3D (struct sock *)msk;
-	struct sk_buff *skb;
-
-	skb =3D skb_peek_tail(&sk->sk_receive_queue);
-	if (skb) {
-		WARN_ON_ONCE(MPTCP_SKB_CB(skb)->end_seq);
-		pr_debug("msk %p moving seq %llx -> %llx end_seq %llx -> %llx\n", sk,
-			 MPTCP_SKB_CB(skb)->map_seq, MPTCP_SKB_CB(skb)->map_seq + msk->ack_seq,
-			 MPTCP_SKB_CB(skb)->end_seq, MPTCP_SKB_CB(skb)->end_seq + msk->ack_seq);
-		MPTCP_SKB_CB(skb)->map_seq +=3D msk->ack_seq;
-		MPTCP_SKB_CB(skb)->end_seq +=3D msk->ack_seq;
-	}
-
-	pr_debug("msk=3D%p ack_seq=3D%llx\n", msk, msk->ack_seq);
-}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 6bd81904747066d8f2c1043dd81b372925f18cbb..55f9698f3c22f1dc423a7605c7b=
00bfda162b54c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -135,7 +135,8 @@ static bool mptcp_try_coalesce(struct sock *sk, struct =
sk_buff *to,
 	bool fragstolen;
 	int delta;
=20
-	if (MPTCP_SKB_CB(from)->offset ||
+	if (unlikely(MPTCP_SKB_CB(to)->cant_coalesce) ||
+	    MPTCP_SKB_CB(from)->offset ||
 	    ((to->len + from->len) > (sk->sk_rcvbuf >> 3)) ||
 	    !skb_try_coalesce(to, from, &fragstolen, &delta))
 		return false;
@@ -366,6 +367,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, st=
ruct sock *ssk,
 	MPTCP_SKB_CB(skb)->end_seq =3D MPTCP_SKB_CB(skb)->map_seq + copy_len;
 	MPTCP_SKB_CB(skb)->offset =3D offset;
 	MPTCP_SKB_CB(skb)->has_rxtstamp =3D has_rxtstamp;
+	MPTCP_SKB_CB(skb)->cant_coalesce =3D 0;
=20
 	if (MPTCP_SKB_CB(skb)->map_seq =3D=3D msk->ack_seq) {
 		/* in sequence */
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 37226cdd9e3717c4f8cf0d4c879a0feaaa91d459..3c3e9b185ae35d92b5a2daae994=
a4a9e76f9cc84 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -129,7 +129,8 @@ struct mptcp_skb_cb {
 	u64 map_seq;
 	u64 end_seq;
 	u32 offset;
-	u8  has_rxtstamp:1;
+	u8  has_rxtstamp;
+	u8  cant_coalesce;
 };
=20
 #define MPTCP_SKB_CB(__skb)	((struct mptcp_skb_cb *)&((__skb)->cb[0]))
@@ -1059,8 +1060,6 @@ void mptcp_event_pm_listener(const struct sock *ssk,
 			     enum mptcp_event_type event);
 bool mptcp_userspace_pm_active(const struct mptcp_sock *msk);
=20
-void __mptcp_fastopen_gen_msk_ackseq(struct mptcp_sock *msk, struct mptcp_=
subflow_context *subflow,
-				     const struct mptcp_options_received *mp_opt);
 void mptcp_fastopen_subflow_synack_set_params(struct mptcp_subflow_context=
 *subflow,
 					      struct request_sock *req);
 int mptcp_nl_fill_addr(struct sk_buff *skb,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 2926bdf88e42c5f2db6875b00b4eca2dbf49dba2..d2caffa56bdd98f5fd9ef07fdcb=
3610ea186b848 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -802,9 +802,6 @@ void __mptcp_subflow_fully_established(struct mptcp_soc=
k *msk,
 	subflow_set_remote_key(msk, subflow, mp_opt);
 	WRITE_ONCE(subflow->fully_established, true);
 	WRITE_ONCE(msk->fully_established, true);
-
-	if (subflow->is_mptfo)
-		__mptcp_fastopen_gen_msk_ackseq(msk, subflow, mp_opt);
 }
=20
 static struct sock *subflow_syn_recv_sock(const struct sock *sk,

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA9391F583C;
	Tue, 18 Feb 2025 18:36:38 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903798; cv=none;
 b=Keoqd5qSi4zOg85e3PNHi917NQEG3Yz5Ldg/ByWy6R71r+Wa0lSYlPKpEEH14cGiG6unyibWK+GSYA1mVHuu2xnw3IXM95SAOAh9o+wvnBAuQb2F+1Lm16hiyNEHDTO/OH4uDJqbeYEMKoyOh1BZCRZl2EEA3YW4z58t1yvFFcs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903798; c=relaxed/simple;
	bh=DyE6gQiuZ9Ci0OJxwDN8nNLr2u0AIT9fOHiDhwzebgY=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=bxC2K5+31vV/HoDcKsGCqg6vn9o6POkcakEfuuScqAzhOajk5q0SmoGQfezx3UsAYYVwJ5IQytE+Ycke/n+TAlnKSoKLRS+8DYCtooW70MvH8cOrj/e2GLtJjHXTbptCkgwX9czu0NLvN13oql7FAd+jCdQLD6iFJIlueU/hdnI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=Qr0p1No5; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="Qr0p1No5"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id EAF8FC4CEE4;
	Tue, 18 Feb 2025 18:36:34 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903798;
	bh=DyE6gQiuZ9Ci0OJxwDN8nNLr2u0AIT9fOHiDhwzebgY=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=Qr0p1No55a0JyGpnCunEIBqgL/NGu19xt+B6z7yptBJTsdm2CYWLf0psRDuihprAB
	 2mWV8n5Ux3Y+/lpnVPiWAQ9tu3pkgJ8WaZQ+ATVReE1qQCRpS/DQzHMPkrucFdg+p2
	 tx7nbbrC1zNhvDCx4IQxqKQtrBzeQpSNXcY9T5N4beMPpGfJLofC/QB0JU+3n/7GKP
	 YiDg2iYtXtmiQojrdAvt2+puhYxTuLaGyP6AvJ9pUmA4XyXmzQ39XWXqOandc/VgE+
	 ZXGpxS8L9sgqm1451m46McMnhdTwYlp+cXqgLRbTAkSwi1aTgzklyF2Ymq4y7FVgXt
	 TdCKRTzU820uQ==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:14 +0100
Subject: [PATCH net-next 3/7] mptcp: move the whole rx path under msk
 socket lock protection
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-3-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=12219; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=O0n9mGfL/jxpDRwQYQi1erBk5pU+KUYXehVHyztmRzM=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMnEDdLb1X+h3v/kBFaoU6s4/tWXLSV8Xg4d
 yEQBj02vxmJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 c2t/EADGl0VjWzDRF47ez0O+LtwGEpLHyU9uMq7Z8bDHYCzYqO9CnoYdjnWOGgKlxtBRPmt2srZ
 poIG5Ab5FNYzxSVlCfWzdIbOxjWFTwB+Z5ZBYIl3SQ1NTt4vJ1rnjnlvKd4MzmBbuPanlI1HuLd
 J1PuYHTUWzGhFOBjAQplWzl3dCfbe1wEu6N519QfZSp22o1Er4igqD/FUPdFi6sw4saQe3QydZX
 hOqL/sdLV0Ygn1wFsHu76U25gxkXTtv+ABR5LOvP5KDQEhlQww023Mxtm8AWAYr5Z1vJVmfKH+f
 5L0Z2pJlfvyIH3NSRKVFwqOF9YM1p3Kh5JTHm7+BWzDmvJIlhAs1g0QKCCL+n6o+M1/w7L+nQOe
 a6FqGtARa6o25wBb36OXO6k5BNAD6hv/gj86gl8KeH8NBAU9ZvZHPC0H1cuug/t0wiQevL/WSIl
 sr5Ka1YbJ4qwInkteTgwBQg6A49BOHz0ylH2MT/d4T0k0nHszGnEgt5JX6rrIDKMgfmKsLuk0d/
 Q5Z6243RTPMxsSt1OA2yx+V/QgyKu7Cbl7VAMtIfBnBeZxE6xx0ESHzF0qyZ2bkTYrNgjjfg9Yb
 xTjszJKhHvT4+sIUzlpB7+FDWC15bAFBHGXJZcCCU/+yO2owvPesPcCQnUD5SnfSvEyeTriMEvW
 lcNvdLd0z53BGWg==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

After commit c2e6048fa1cf ("mptcp: fix race in release_cb"), we can
move the whole MPTCP rx path under the socket lock, leveraging the
release_cb.

We can drop a bunch of spin_lock pairs in the receive functions, use
a single receive queue and invoke __mptcp_move_skbs only when subflows
ask for it.

This will allow more cleanup in the next patch.

Some changes are worth specific mention:

The msk rcvbuf update now always happens under both the msk and the
subflow socket locks: we can drop a bunch of ONCE annotations and
consolidate the checks.

When the skb move is delayed until msk release callback time, the msk
rcvbuf update is delayed too; take care of that update in
__mptcp_move_skbs() as well.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/fastopen.c |   1 +
 net/mptcp/protocol.c | 123 ++++++++++++++++++++++++-----------------------=
----
 net/mptcp/protocol.h |   2 +-
 3 files changed, 60 insertions(+), 66 deletions(-)

diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c
index 7777f5a2d14379853fcd13c4b57c5569be05a2e4..f85ad19f3dd6c4bcbf31228054c=
cfd30755db5bc 100644
--- a/net/mptcp/fastopen.c
+++ b/net/mptcp/fastopen.c
@@ -48,6 +48,7 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptc=
p_subflow_context *subf
 	MPTCP_SKB_CB(skb)->cant_coalesce =3D 1;
=20
 	mptcp_data_lock(sk);
+	DEBUG_NET_WARN_ON_ONCE(sock_owned_by_user_nocheck(sk));
=20
 	mptcp_set_owner_r(skb, sk);
 	__skb_queue_tail(&sk->sk_receive_queue, skb);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 55f9698f3c22f1dc423a7605c7b00bfda162b54c..8bdc7a7a58f31ac74d6a2156b22=
97af9cd90c635 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -645,18 +645,6 @@ static bool __mptcp_move_skbs_from_subflow(struct mptc=
p_sock *msk,
 	bool more_data_avail;
 	struct tcp_sock *tp;
 	bool done =3D false;
-	int sk_rbuf;
-
-	sk_rbuf =3D READ_ONCE(sk->sk_rcvbuf);
-
-	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
-		int ssk_rbuf =3D READ_ONCE(ssk->sk_rcvbuf);
-
-		if (unlikely(ssk_rbuf > sk_rbuf)) {
-			WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf);
-			sk_rbuf =3D ssk_rbuf;
-		}
-	}
=20
 	pr_debug("msk=3D%p ssk=3D%p\n", msk, ssk);
 	tp =3D tcp_sk(ssk);
@@ -724,7 +712,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp=
_sock *msk,
 		WRITE_ONCE(tp->copied_seq, seq);
 		more_data_avail =3D mptcp_subflow_data_available(ssk);
=20
-		if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) {
+		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf) {
 			done =3D true;
 			break;
 		}
@@ -848,11 +836,30 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, =
struct sock *ssk)
 	return moved > 0;
 }
=20
+static void __mptcp_rcvbuf_update(struct sock *sk, struct sock *ssk)
+{
+	if (unlikely(ssk->sk_rcvbuf > sk->sk_rcvbuf))
+		WRITE_ONCE(sk->sk_rcvbuf, ssk->sk_rcvbuf);
+}
+
+static void __mptcp_data_ready(struct sock *sk, struct sock *ssk)
+{
+	struct mptcp_sock *msk =3D mptcp_sk(sk);
+
+	__mptcp_rcvbuf_update(sk, ssk);
+
+	/* over limit? can't append more skbs to msk, Also, no need to wake-up*/
+	if (__mptcp_rmem(sk) > sk->sk_rcvbuf)
+		return;
+
+	/* Wake-up the reader only for in-sequence data */
+	if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk))
+		sk->sk_data_ready(sk);
+}
+
 void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk);
-	struct mptcp_sock *msk =3D mptcp_sk(sk);
-	int sk_rbuf, ssk_rbuf;
=20
 	/* The peer can send data while we are shutting down this
 	 * subflow at msk destruction time, but we must avoid enqueuing
@@ -861,19 +868,11 @@ void mptcp_data_ready(struct sock *sk, struct sock *s=
sk)
 	if (unlikely(subflow->disposable))
 		return;
=20
-	ssk_rbuf =3D READ_ONCE(ssk->sk_rcvbuf);
-	sk_rbuf =3D READ_ONCE(sk->sk_rcvbuf);
-	if (unlikely(ssk_rbuf > sk_rbuf))
-		sk_rbuf =3D ssk_rbuf;
-
-	/* over limit? can't append more skbs to msk, Also, no need to wake-up*/
-	if (__mptcp_rmem(sk) > sk_rbuf)
-		return;
-
-	/* Wake-up the reader only for in-sequence data */
 	mptcp_data_lock(sk);
-	if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk))
-		sk->sk_data_ready(sk);
+	if (!sock_owned_by_user(sk))
+		__mptcp_data_ready(sk, ssk);
+	else
+		__set_bit(MPTCP_DEQUEUE, &mptcp_sk(sk)->cb_flags);
 	mptcp_data_unlock(sk);
 }
=20
@@ -1946,16 +1945,17 @@ static int mptcp_sendmsg(struct sock *sk, struct ms=
ghdr *msg, size_t len)
=20
 static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied);
=20
-static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk,
+static int __mptcp_recvmsg_mskq(struct sock *sk,
 				struct msghdr *msg,
 				size_t len, int flags,
 				struct scm_timestamping_internal *tss,
 				int *cmsg_flags)
 {
+	struct mptcp_sock *msk =3D mptcp_sk(sk);
 	struct sk_buff *skb, *tmp;
 	int copied =3D 0;
=20
-	skb_queue_walk_safe(&msk->receive_queue, skb, tmp) {
+	skb_queue_walk_safe(&sk->sk_receive_queue, skb, tmp) {
 		u32 offset =3D MPTCP_SKB_CB(skb)->offset;
 		u32 data_len =3D skb->len - offset;
 		u32 count =3D min_t(size_t, len - copied, data_len);
@@ -1990,7 +1990,7 @@ static int __mptcp_recvmsg_mskq(struct mptcp_sock *ms=
k,
 			/* we will bulk release the skb memory later */
 			skb->destructor =3D NULL;
 			WRITE_ONCE(msk->rmem_released, msk->rmem_released + skb->truesize);
-			__skb_unlink(skb, &msk->receive_queue);
+			__skb_unlink(skb, &sk->sk_receive_queue);
 			__kfree_skb(skb);
 			msk->bytes_consumed +=3D count;
 		}
@@ -2115,54 +2115,46 @@ static void __mptcp_update_rmem(struct sock *sk)
 	WRITE_ONCE(msk->rmem_released, 0);
 }
=20
-static void __mptcp_splice_receive_queue(struct sock *sk)
+static bool __mptcp_move_skbs(struct sock *sk)
 {
+	struct mptcp_subflow_context *subflow;
 	struct mptcp_sock *msk =3D mptcp_sk(sk);
-
-	skb_queue_splice_tail_init(&sk->sk_receive_queue, &msk->receive_queue);
-}
-
-static bool __mptcp_move_skbs(struct mptcp_sock *msk)
-{
-	struct sock *sk =3D (struct sock *)msk;
 	unsigned int moved =3D 0;
 	bool ret, done;
=20
+	/* verify we can move any data from the subflow, eventually updating */
+	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
+		mptcp_for_each_subflow(msk, subflow)
+			__mptcp_rcvbuf_update(sk, subflow->tcp_sock);
+
+	if (__mptcp_rmem(sk) > sk->sk_rcvbuf)
+		return false;
+
 	do {
 		struct sock *ssk =3D mptcp_subflow_recv_lookup(msk);
 		bool slowpath;
=20
-		/* we can have data pending in the subflows only if the msk
-		 * receive buffer was full at subflow_data_ready() time,
-		 * that is an unlikely slow path.
-		 */
-		if (likely(!ssk))
+		if (unlikely(!ssk))
 			break;
=20
 		slowpath =3D lock_sock_fast(ssk);
-		mptcp_data_lock(sk);
 		__mptcp_update_rmem(sk);
 		done =3D __mptcp_move_skbs_from_subflow(msk, ssk, &moved);
-		mptcp_data_unlock(sk);
=20
 		if (unlikely(ssk->sk_err))
 			__mptcp_error_report(sk);
 		unlock_sock_fast(ssk, slowpath);
 	} while (!done);
=20
-	/* acquire the data lock only if some input data is pending */
 	ret =3D moved > 0;
 	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
-	    !skb_queue_empty_lockless(&sk->sk_receive_queue)) {
-		mptcp_data_lock(sk);
+	    !skb_queue_empty(&sk->sk_receive_queue)) {
 		__mptcp_update_rmem(sk);
 		ret |=3D __mptcp_ofo_queue(msk);
-		__mptcp_splice_receive_queue(sk);
-		mptcp_data_unlock(sk);
 	}
 	if (ret)
 		mptcp_check_data_fin((struct sock *)msk);
-	return !skb_queue_empty(&msk->receive_queue);
+	return ret;
 }
=20
 static unsigned int mptcp_inq_hint(const struct sock *sk)
@@ -2170,7 +2162,7 @@ static unsigned int mptcp_inq_hint(const struct sock =
*sk)
 	const struct mptcp_sock *msk =3D mptcp_sk(sk);
 	const struct sk_buff *skb;
=20
-	skb =3D skb_peek(&msk->receive_queue);
+	skb =3D skb_peek(&sk->sk_receive_queue);
 	if (skb) {
 		u64 hint_val =3D READ_ONCE(msk->ack_seq) - MPTCP_SKB_CB(skb)->map_seq;
=20
@@ -2216,7 +2208,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh=
dr *msg, size_t len,
 	while (copied < len) {
 		int err, bytes_read;
=20
-		bytes_read =3D __mptcp_recvmsg_mskq(msk, msg, len - copied, flags, &tss,=
 &cmsg_flags);
+		bytes_read =3D __mptcp_recvmsg_mskq(sk, msg, len - copied, flags, &tss, =
&cmsg_flags);
 		if (unlikely(bytes_read < 0)) {
 			if (!copied)
 				copied =3D bytes_read;
@@ -2225,7 +2217,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh=
dr *msg, size_t len,
=20
 		copied +=3D bytes_read;
=20
-		if (skb_queue_empty(&msk->receive_queue) && __mptcp_move_skbs(msk))
+		if (skb_queue_empty(&sk->sk_receive_queue) && __mptcp_move_skbs(sk))
 			continue;
=20
 		/* only the MPTCP socket status is relevant here. The exit
@@ -2251,7 +2243,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh=
dr *msg, size_t len,
 				/* race breaker: the shutdown could be after the
 				 * previous receive queue check
 				 */
-				if (__mptcp_move_skbs(msk))
+				if (__mptcp_move_skbs(sk))
 					continue;
 				break;
 			}
@@ -2295,9 +2287,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh=
dr *msg, size_t len,
 		}
 	}
=20
-	pr_debug("msk=3D%p rx queue empty=3D%d:%d copied=3D%d\n",
-		 msk, skb_queue_empty_lockless(&sk->sk_receive_queue),
-		 skb_queue_empty(&msk->receive_queue), copied);
+	pr_debug("msk=3D%p rx queue empty=3D%d copied=3D%d\n",
+		 msk, skb_queue_empty(&sk->sk_receive_queue), copied);
=20
 	release_sock(sk);
 	return copied;
@@ -2824,7 +2815,6 @@ static void __mptcp_init_sock(struct sock *sk)
 	INIT_LIST_HEAD(&msk->join_list);
 	INIT_LIST_HEAD(&msk->rtx_queue);
 	INIT_WORK(&msk->work, mptcp_worker);
-	__skb_queue_head_init(&msk->receive_queue);
 	msk->out_of_order_queue =3D RB_ROOT;
 	msk->first_pending =3D NULL;
 	WRITE_ONCE(msk->rmem_fwd_alloc, 0);
@@ -3407,12 +3397,8 @@ void mptcp_destroy_common(struct mptcp_sock *msk, un=
signed int flags)
 	mptcp_for_each_subflow_safe(msk, subflow, tmp)
 		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags);
=20
-	/* move to sk_receive_queue, sk_stream_kill_queues will purge it */
-	mptcp_data_lock(sk);
-	skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue);
 	__skb_queue_purge(&sk->sk_receive_queue);
 	skb_rbtree_purge(&msk->out_of_order_queue);
-	mptcp_data_unlock(sk);
=20
 	/* move all the rx fwd alloc into the sk_mem_reclaim_final in
 	 * inet_sock_destruct() will dispose it
@@ -3455,7 +3441,8 @@ void __mptcp_check_push(struct sock *sk, struct sock =
*ssk)
=20
 #define MPTCP_FLAGS_PROCESS_CTX_NEED (BIT(MPTCP_PUSH_PENDING) | \
 				      BIT(MPTCP_RETRANSMIT) | \
-				      BIT(MPTCP_FLUSH_JOIN_LIST))
+				      BIT(MPTCP_FLUSH_JOIN_LIST) | \
+				      BIT(MPTCP_DEQUEUE))
=20
 /* processes deferred events and flush wmem */
 static void mptcp_release_cb(struct sock *sk)
@@ -3489,6 +3476,11 @@ static void mptcp_release_cb(struct sock *sk)
 			__mptcp_push_pending(sk, 0);
 		if (flags & BIT(MPTCP_RETRANSMIT))
 			__mptcp_retrans(sk);
+		if ((flags & BIT(MPTCP_DEQUEUE)) && __mptcp_move_skbs(sk)) {
+			/* notify ack seq update */
+			mptcp_cleanup_rbuf(msk, 0);
+			sk->sk_data_ready(sk);
+		}
=20
 		cond_resched();
 		spin_lock_bh(&sk->sk_lock.slock);
@@ -3726,7 +3718,8 @@ static int mptcp_ioctl(struct sock *sk, int cmd, int =
*karg)
 			return -EINVAL;
=20
 		lock_sock(sk);
-		__mptcp_move_skbs(msk);
+		if (__mptcp_move_skbs(sk))
+			mptcp_cleanup_rbuf(msk, 0);
 		*karg =3D mptcp_inq_hint(sk);
 		release_sock(sk);
 		break;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3c3e9b185ae35d92b5a2daae994a4a9e76f9cc84..753456b73f90879126a36964924=
d2b6e08e2a1cc 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -124,6 +124,7 @@
 #define MPTCP_FLUSH_JOIN_LIST	5
 #define MPTCP_SYNC_STATE	6
 #define MPTCP_SYNC_SNDBUF	7
+#define MPTCP_DEQUEUE		8
=20
 struct mptcp_skb_cb {
 	u64 map_seq;
@@ -325,7 +326,6 @@ struct mptcp_sock {
 	struct work_struct work;
 	struct sk_buff  *ooo_last_skb;
 	struct rb_root  out_of_order_queue;
-	struct sk_buff_head receive_queue;
 	struct list_head conn_list;
 	struct list_head rtx_queue;
 	struct mptcp_data_frag *first_pending;

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4B38C1F585B;
	Tue, 18 Feb 2025 18:36:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903802; cv=none;
 b=Hb0UZlQFT7/m24YL6+mGS/aR35JSOeuCYHT5KlHhq3sCXxb9IoaVg75FClwb4XigN7A+ZE4ar9FWOrjkdrDrhCWtC6tmk/aQJ51n1C3NdBQwyfsCo9qZDH3XoDWHotJ9oUXwTrKEJUVnmH5C4RaAuuuI8fyLP/cvloipO2KWJO8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903802; c=relaxed/simple;
	bh=OoEmchlQBBotC5o70iyC1CwCPNurWZ6JNiqtCzdY7Ws=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=lypTP0a8OQ1htomv4YkBqHsAYvlMGwlWDSAQCCzY/igTOSpidLqKOgyPbAzSNzNqlBTodsIEjtredo9M10Prvyra/3OmGFXn1Bg9Y3CvkILYUgMlfjEXpTg97hv2SH5ZDkBDtwQ6240lfgZ+c/mx65/OqnHbTME5CNT2Lwjv7Hc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=FmXOXYgG; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="FmXOXYgG"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 973B8C4CEE2;
	Tue, 18 Feb 2025 18:36:38 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903801;
	bh=OoEmchlQBBotC5o70iyC1CwCPNurWZ6JNiqtCzdY7Ws=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=FmXOXYgG5F/9gWkBrgA7YhMb8jEVRbVFSQB5Gx+ffUjJIA4Q8bJ7ofwhpj7hTvFpV
	 GNqTmM7LnivKyszB5CT4sBYJnSxRGjm1MCQjmxuzC+jehahjw4cKrAe1SyQBcIQWbj
	 TDVDTsoU4j564dDlY7owPNGRdcEurpTfvnIXAyzAbjtPLeShYO8QKnM0vSsVlQKU88
	 WPdb8CntBKaOu684Z6CFStRGhH95RT6mWjykw4P9vSckqNRrPmySwuHQzc8CpqKiNe
	 uHblHAfrixb0WkCQusX5BJF0ZutYW5JDyiXsNwwNWkga4bcmiuLj3N0q1l1cEASvRN
	 h2lPggPpvYQfw==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:15 +0100
Subject: [PATCH net-next 4/7] mptcp: cleanup mem accounting
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-4-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=9614; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=nH6TbZ5V5qQ8waWiktorcZY2CmpT15tov8l5vC4mv4w=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMnJOFs2Jj+z1+zpnXC4sNW8qFmbPHS4cKev
 9Ihdti/jfSJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 cwAkEADOPpslPpgdDoDz+9BQ2gZLK5XioxOiTRRCO/fNb1YX/gSwiS5O/Sbwq6ECQw79B+H3Y7z
 KV22bDUv0lL98m1971Rm/USGLmoEcaYBs0bhlIUGAq+Rr1+DOZG64o6p34W3qgv05oxdxAelLgn
 2sQ1IOEelqfQjm4qk6M6/mx4zndcAZF56XCpxH6YzFS4hp8CfQDAnEqgLQ8vBIASWbwW02IeSX7
 ySx8xxSiWKiFfOAoBmya6Xr2ifG0Vj+OXKyZiPo9oQqt8nUv/sZFFWEdqryYNM/Uin4SgO2oGcg
 Cjl9cd1bzs2gMo2FmOV+6UMIHsUef56Iz9UmHgH5SHGvrWWVTOJJOL1IhT56pK/aSIx1WslZg3o
 veudCoFr3TiAs8SEWr/y3m3AnncD1HT9aCcfBDC2BTpwt7myiD2Nsq2eaQXfDwEsx03yWnNhqLZ
 aF5YQEpeoKywzpuNZz0C8dpJIbCCLYcAmr8EdEEo+vyayyY8eQ9ZSs3Faak+aEvzBIDputXPMJe
 Cj/5ivwOU4FwSteTge5i546ZbyT6ouMGp5TqJJJJP3+uS+G6+lk7jHyRlR/+Ahey8EuOTXTUD52
 aBYrMa16xGoY2lvR4Artn18Tzs1jBsPw0HzZrd7NyDfvlGoEVLQ07PDASkDzSvS4ERyJo8IUBFu
 MwPax0mduj+eIlw==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

After the previous patch, updating sk_forward_alloc is cheap, so we
can drop a lot of complexity from the MPTCP memory accounting by
removing the custom forward memory allocation for rmem.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/fastopen.c |   2 +-
 net/mptcp/protocol.c | 115 ++++-------------------------------------------=
----
 net/mptcp/protocol.h |   4 +-
 3 files changed, 10 insertions(+), 111 deletions(-)

diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c
index f85ad19f3dd6c4bcbf31228054ccfd30755db5bc..b9e4511979028c10d232efbcaca=
68400fc4f2e7a 100644
--- a/net/mptcp/fastopen.c
+++ b/net/mptcp/fastopen.c
@@ -50,7 +50,7 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptc=
p_subflow_context *subf
 	mptcp_data_lock(sk);
 	DEBUG_NET_WARN_ON_ONCE(sock_owned_by_user_nocheck(sk));
=20
-	mptcp_set_owner_r(skb, sk);
+	skb_set_owner_r(skb, sk);
 	__skb_queue_tail(&sk->sk_receive_queue, skb);
 	mptcp_sk(sk)->bytes_received +=3D skb->len;
=20
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8bdc7a7a58f31ac74d6a2156b2297af9cd90c635..080877f8daf7e3ff36531f3e110=
79d2163676f2d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -118,17 +118,6 @@ static void mptcp_drop(struct sock *sk, struct sk_buff=
 *skb)
 	__kfree_skb(skb);
 }
=20
-static void mptcp_rmem_fwd_alloc_add(struct sock *sk, int size)
-{
-	WRITE_ONCE(mptcp_sk(sk)->rmem_fwd_alloc,
-		   mptcp_sk(sk)->rmem_fwd_alloc + size);
-}
-
-static void mptcp_rmem_charge(struct sock *sk, int size)
-{
-	mptcp_rmem_fwd_alloc_add(sk, -size);
-}
-
 static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to,
 			       struct sk_buff *from)
 {
@@ -151,7 +140,7 @@ static bool mptcp_try_coalesce(struct sock *sk, struct =
sk_buff *to,
 	 * negative one
 	 */
 	atomic_add(delta, &sk->sk_rmem_alloc);
-	mptcp_rmem_charge(sk, delta);
+	sk_mem_charge(sk, delta);
 	kfree_skb_partial(from, fragstolen);
=20
 	return true;
@@ -166,44 +155,6 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *=
msk, struct sk_buff *to,
 	return mptcp_try_coalesce((struct sock *)msk, to, from);
 }
=20
-static void __mptcp_rmem_reclaim(struct sock *sk, int amount)
-{
-	amount >>=3D PAGE_SHIFT;
-	mptcp_rmem_charge(sk, amount << PAGE_SHIFT);
-	__sk_mem_reduce_allocated(sk, amount);
-}
-
-static void mptcp_rmem_uncharge(struct sock *sk, int size)
-{
-	struct mptcp_sock *msk =3D mptcp_sk(sk);
-	int reclaimable;
-
-	mptcp_rmem_fwd_alloc_add(sk, size);
-	reclaimable =3D msk->rmem_fwd_alloc - sk_unused_reserved_mem(sk);
-
-	/* see sk_mem_uncharge() for the rationale behind the following schema */
-	if (unlikely(reclaimable >=3D PAGE_SIZE))
-		__mptcp_rmem_reclaim(sk, reclaimable);
-}
-
-static void mptcp_rfree(struct sk_buff *skb)
-{
-	unsigned int len =3D skb->truesize;
-	struct sock *sk =3D skb->sk;
-
-	atomic_sub(len, &sk->sk_rmem_alloc);
-	mptcp_rmem_uncharge(sk, len);
-}
-
-void mptcp_set_owner_r(struct sk_buff *skb, struct sock *sk)
-{
-	skb_orphan(skb);
-	skb->sk =3D sk;
-	skb->destructor =3D mptcp_rfree;
-	atomic_add(skb->truesize, &sk->sk_rmem_alloc);
-	mptcp_rmem_charge(sk, skb->truesize);
-}
-
 /* "inspired" by tcp_data_queue_ofo(), main differences:
  * - use mptcp seqs
  * - don't cope with sacks
@@ -316,25 +267,7 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *ms=
k, struct sk_buff *skb)
=20
 end:
 	skb_condense(skb);
-	mptcp_set_owner_r(skb, sk);
-}
-
-static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int siz=
e)
-{
-	struct mptcp_sock *msk =3D mptcp_sk(sk);
-	int amt, amount;
-
-	if (size <=3D msk->rmem_fwd_alloc)
-		return true;
-
-	size -=3D msk->rmem_fwd_alloc;
-	amt =3D sk_mem_pages(size);
-	amount =3D amt << PAGE_SHIFT;
-	if (!__sk_mem_raise_allocated(sk, size, amt, SK_MEM_RECV))
-		return false;
-
-	mptcp_rmem_fwd_alloc_add(sk, amount);
-	return true;
+	skb_set_owner_r(skb, sk);
 }
=20
 static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
@@ -352,7 +285,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, st=
ruct sock *ssk,
 	skb_orphan(skb);
=20
 	/* try to fetch required memory from subflow */
-	if (!mptcp_rmem_schedule(sk, ssk, skb->truesize)) {
+	if (!sk_rmem_schedule(sk, skb, skb->truesize)) {
 		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED);
 		goto drop;
 	}
@@ -377,7 +310,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, st=
ruct sock *ssk,
 		if (tail && mptcp_try_coalesce(sk, tail, skb))
 			return true;
=20
-		mptcp_set_owner_r(skb, sk);
+		skb_set_owner_r(skb, sk);
 		__skb_queue_tail(&sk->sk_receive_queue, skb);
 		return true;
 	} else if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) {
@@ -1987,9 +1920,10 @@ static int __mptcp_recvmsg_mskq(struct sock *sk,
 		}
=20
 		if (!(flags & MSG_PEEK)) {
-			/* we will bulk release the skb memory later */
+			/* avoid the indirect call, we know the destructor is sock_rfree */
 			skb->destructor =3D NULL;
-			WRITE_ONCE(msk->rmem_released, msk->rmem_released + skb->truesize);
+			atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
+			sk_mem_uncharge(sk, skb->truesize);
 			__skb_unlink(skb, &sk->sk_receive_queue);
 			__kfree_skb(skb);
 			msk->bytes_consumed +=3D count;
@@ -2103,18 +2037,6 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock=
 *msk, int copied)
 	msk->rcvq_space.time =3D mstamp;
 }
=20
-static void __mptcp_update_rmem(struct sock *sk)
-{
-	struct mptcp_sock *msk =3D mptcp_sk(sk);
-
-	if (!msk->rmem_released)
-		return;
-
-	atomic_sub(msk->rmem_released, &sk->sk_rmem_alloc);
-	mptcp_rmem_uncharge(sk, msk->rmem_released);
-	WRITE_ONCE(msk->rmem_released, 0);
-}
-
 static bool __mptcp_move_skbs(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow;
@@ -2138,7 +2060,6 @@ static bool __mptcp_move_skbs(struct sock *sk)
 			break;
=20
 		slowpath =3D lock_sock_fast(ssk);
-		__mptcp_update_rmem(sk);
 		done =3D __mptcp_move_skbs_from_subflow(msk, ssk, &moved);
=20
 		if (unlikely(ssk->sk_err))
@@ -2146,12 +2067,7 @@ static bool __mptcp_move_skbs(struct sock *sk)
 		unlock_sock_fast(ssk, slowpath);
 	} while (!done);
=20
-	ret =3D moved > 0;
-	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
-	    !skb_queue_empty(&sk->sk_receive_queue)) {
-		__mptcp_update_rmem(sk);
-		ret |=3D __mptcp_ofo_queue(msk);
-	}
+	ret =3D moved > 0 || __mptcp_ofo_queue(msk);
 	if (ret)
 		mptcp_check_data_fin((struct sock *)msk);
 	return ret;
@@ -2817,8 +2733,6 @@ static void __mptcp_init_sock(struct sock *sk)
 	INIT_WORK(&msk->work, mptcp_worker);
 	msk->out_of_order_queue =3D RB_ROOT;
 	msk->first_pending =3D NULL;
-	WRITE_ONCE(msk->rmem_fwd_alloc, 0);
-	WRITE_ONCE(msk->rmem_released, 0);
 	msk->timer_ival =3D TCP_RTO_MIN;
 	msk->scaling_ratio =3D TCP_DEFAULT_SCALING_RATIO;
=20
@@ -3044,8 +2958,6 @@ static void __mptcp_destroy_sock(struct sock *sk)
=20
 	sk->sk_prot->destroy(sk);
=20
-	WARN_ON_ONCE(READ_ONCE(msk->rmem_fwd_alloc));
-	WARN_ON_ONCE(msk->rmem_released);
 	sk_stream_kill_queues(sk);
 	xfrm_sk_free_policy(sk);
=20
@@ -3403,8 +3315,6 @@ void mptcp_destroy_common(struct mptcp_sock *msk, uns=
igned int flags)
 	/* move all the rx fwd alloc into the sk_mem_reclaim_final in
 	 * inet_sock_destruct() will dispose it
 	 */
-	sk_forward_alloc_add(sk, msk->rmem_fwd_alloc);
-	WRITE_ONCE(msk->rmem_fwd_alloc, 0);
 	mptcp_token_destroy(msk);
 	mptcp_pm_free_anno_list(msk);
 	mptcp_free_local_addr_list(msk);
@@ -3500,8 +3410,6 @@ static void mptcp_release_cb(struct sock *sk)
 		if (__test_and_clear_bit(MPTCP_SYNC_SNDBUF, &msk->cb_flags))
 			__mptcp_sync_sndbuf(sk);
 	}
-
-	__mptcp_update_rmem(sk);
 }
=20
 /* MP_JOIN client subflow must wait for 4th ack before sending any data:
@@ -3672,12 +3580,6 @@ static void mptcp_shutdown(struct sock *sk, int how)
 		__mptcp_wr_shutdown(sk);
 }
=20
-static int mptcp_forward_alloc_get(const struct sock *sk)
-{
-	return READ_ONCE(sk->sk_forward_alloc) +
-	       READ_ONCE(mptcp_sk(sk)->rmem_fwd_alloc);
-}
-
 static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v)
 {
 	const struct sock *sk =3D (void *)msk;
@@ -3836,7 +3738,6 @@ static struct proto mptcp_prot =3D {
 	.hash		=3D mptcp_hash,
 	.unhash		=3D mptcp_unhash,
 	.get_port	=3D mptcp_get_port,
-	.forward_alloc_get	=3D mptcp_forward_alloc_get,
 	.stream_memory_free	=3D mptcp_stream_memory_free,
 	.sockets_allocated	=3D &mptcp_sockets_allocated,
=20
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 753456b73f90879126a36964924d2b6e08e2a1cc..613d556ed938a99a2800b4384ee=
4c6cda9483381 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -281,7 +281,6 @@ struct mptcp_sock {
 	u64		rcv_data_fin_seq;
 	u64		bytes_retrans;
 	u64		bytes_consumed;
-	int		rmem_fwd_alloc;
 	int		snd_burst;
 	int		old_wspace;
 	u64		recovery_snd_nxt;	/* in recovery mode accept up to this seq;
@@ -296,7 +295,6 @@ struct mptcp_sock {
 	u32		last_ack_recv;
 	unsigned long	timer_ival;
 	u32		token;
-	int		rmem_released;
 	unsigned long	flags;
 	unsigned long	cb_flags;
 	bool		recovery;		/* closing subflow write queue reinjected */
@@ -387,7 +385,7 @@ static inline void msk_owned_by_me(const struct mptcp_s=
ock *msk)
  */
 static inline int __mptcp_rmem(const struct sock *sk)
 {
-	return atomic_read(&sk->sk_rmem_alloc) - READ_ONCE(mptcp_sk(sk)->rmem_rel=
eased);
+	return atomic_read(&sk->sk_rmem_alloc);
 }
=20
 static inline int mptcp_win_from_space(const struct sock *sk, int space)

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 96DB726B2D3;
	Tue, 18 Feb 2025 18:36:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903805; cv=none;
 b=dAaBdk29YyM9KVDqUT+81cfLk3DJ8pL8XU6xjT3+2Zc99ZunfWgx3erh4b4an2cAAbcrhCioc9lwqXh3RcUrI9mWWNojUk99mfv5VFvQvnG4xJMfvvuonyNQeY31kK3UMx/YHE4pL058O07a9V1aBWqXhv6FYW4PTbXiC2QTsoI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903805; c=relaxed/simple;
	bh=a2n1iRP+j6iFehN5PBuGMpY43pwjb/WhoWJ7Nx56u8Q=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=g0MD9TSPcQqmkY9caPwCPMEarSUKy2ocL7M7NmItJfvOcyCGUXFNiiARox0N62W2G77n9XZvPb/5/HdIUZQxdEDuRuh7pTEYyk1hblIMUHsVOcKsflygoRuAMR6UtJzB9iXltL3j/9VG+w0b9HCkjKvi9jXAqv9Sj5kamSHv3T8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=WKs+BKHb; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="WKs+BKHb"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3FCF8C4CEE4;
	Tue, 18 Feb 2025 18:36:42 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903805;
	bh=a2n1iRP+j6iFehN5PBuGMpY43pwjb/WhoWJ7Nx56u8Q=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=WKs+BKHbuj7T4KehhFAsemBS9FoQrVFyGcFLIBh6gTLvvt/C7b+h0yc+ivlDX4fK3
	 73ME2Fhv4n8jUEvLzIwgpvKdemviZwsikf1hBA4N44uzEZahOAjjXR5lHllGZfp57b
	 xtml2Rdgbl5tBWPCmJL0+40kkKmWhUQt00BNKhIa9f4r2bju/W38dIrZO4QcjPEdiL
	 ZbY8dV6Ika6wQ4bZ0TYxEo3G+8bsJWbqmDrhPSY9d8FZPqi3r/Idj3TTHO1k2l68I5
	 DHLm/ms9yPrFWngzGl1HxQ5050Re88HJu+qPk7rjQsJgk8TBgNLMmK0Y/bmn5u5O9I
	 HEaNkNAtWpRRQ==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:16 +0100
Subject: [PATCH net-next 5/7] net: dismiss sk_forward_alloc_get()
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-5-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=4080; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=viiaOyZW64iyCHKnCj0hoDEEHUJ4EIG7vZ9h8lhf3Gk=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMnYA3KhIc1NRpfdciZGIALMCW6GMTMW20Z2
 V3Hu/DXVT2JAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 cw4FD/4ttVC+hSUqJBWuS+exx/4G58u/1diepZ3yK0/xbX2eg6hCflT/KNpzWouEyINnwZph4L0
 4EhKU1109fFZSphu7/aGh2nQwfJ+qQB4PdEZAUZTPL0TOaTxoKRKn9sAch3vzeUHN3+lo+eO/fz
 TLQthg5ZJV/pIBh84aWweCaYSSwEMVjpF7IRFo41exV5UdlctHpMVZX4CxQd5qEdx7+e+ywD2nf
 Ur25vuB2je4eCMXN71sSTccReKW9NVJJxxSua4hAkXJGCXQDgFd8ShlKJdUyZFbSeU0Nnhz1+jp
 pcIsMak4veyWLXw/tDFmWTXACkH0RIpM9ZtsI2XhfQbAMygvXOolh+Ze1jt5mWFzb3NTF2vW0N1
 VeA5VLy7PA1JJw6d/fdxeB45y0uXwf5DkBHBoscZOTeDIQbgbrw9v9Dyb2zCuDlpEpJK1OMq/SW
 YbxJpSOaA0hu7f9jvS94wlyQrgfFUaNlawei8EJ9pQBRbAUBnzJnQk8wJcNUdthERLggW+1lKWV
 wyjpFoIOg5s7XIxYihXrYVtd6NqrHKit8kR3xTTi/CuKtFjeQfLUAmVYtO1bsW+84ZnorwmFG+y
 qQg/ilNkE1EyjHgSd807RQ7IGzF1Gelivm56TRBbk5x6qoFDQizDscz1mydzC1I8wLEGYDE8D87
 nzrJ7Z5XWVHBP0g==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

After the previous patch we can remove the forward_alloc_get
proto callback, basically reverting commit 292e6077b040 ("net: introduce
sk_forward_alloc_get()") and commit 66d58f046c9d ("net: use
sk_forward_alloc_get() in sk_get_meminfo()").

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 include/net/sock.h   | 13 -------------
 net/core/sock.c      |  2 +-
 net/ipv4/af_inet.c   |  2 +-
 net/ipv4/inet_diag.c |  2 +-
 net/sched/em_meta.c  |  2 +-
 5 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 60ebf3c7b229e257b164e0de1f56543ea69f38f3..ac7fb5bd8ef9af10135a6e70340=
8f2b24bd3d713 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1285,10 +1285,6 @@ struct proto {
 	unsigned int		inuse_idx;
 #endif
=20
-#if IS_ENABLED(CONFIG_MPTCP)
-	int			(*forward_alloc_get)(const struct sock *sk);
-#endif
-
 	bool			(*stream_memory_free)(const struct sock *sk, int wake);
 	bool			(*sock_is_readable)(struct sock *sk);
 	/* Memory pressure */
@@ -1349,15 +1345,6 @@ int sock_load_diag_module(int family, int protocol);
=20
 INDIRECT_CALLABLE_DECLARE(bool tcp_stream_memory_free(const struct sock *s=
k, int wake));
=20
-static inline int sk_forward_alloc_get(const struct sock *sk)
-{
-#if IS_ENABLED(CONFIG_MPTCP)
-	if (sk->sk_prot->forward_alloc_get)
-		return sk->sk_prot->forward_alloc_get(sk);
-#endif
-	return READ_ONCE(sk->sk_forward_alloc);
-}
-
 static inline bool __sk_stream_memory_free(const struct sock *sk, int wake)
 {
 	if (READ_ONCE(sk->sk_wmem_queued) >=3D READ_ONCE(sk->sk_sndbuf))
diff --git a/net/core/sock.c b/net/core/sock.c
index 53c7af0038c4fca630e1ac2ebecf55558cb16eef..0d385bf27b38d97458e6a695a55=
9f4f1600773c4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3882,7 +3882,7 @@ void sk_get_meminfo(const struct sock *sk, u32 *mem)
 	mem[SK_MEMINFO_RCVBUF] =3D READ_ONCE(sk->sk_rcvbuf);
 	mem[SK_MEMINFO_WMEM_ALLOC] =3D sk_wmem_alloc_get(sk);
 	mem[SK_MEMINFO_SNDBUF] =3D READ_ONCE(sk->sk_sndbuf);
-	mem[SK_MEMINFO_FWD_ALLOC] =3D sk_forward_alloc_get(sk);
+	mem[SK_MEMINFO_FWD_ALLOC] =3D READ_ONCE(sk->sk_forward_alloc);
 	mem[SK_MEMINFO_WMEM_QUEUED] =3D READ_ONCE(sk->sk_wmem_queued);
 	mem[SK_MEMINFO_OPTMEM] =3D atomic_read(&sk->sk_omem_alloc);
 	mem[SK_MEMINFO_BACKLOG] =3D READ_ONCE(sk->sk_backlog.len);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 21f46ee7b6e95329a2f7f0e0429eebf1648e7f9d..5df1f1325259d9b9dbe3be19a81=
066f85cf306e5 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -153,7 +153,7 @@ void inet_sock_destruct(struct sock *sk)
 	WARN_ON_ONCE(atomic_read(&sk->sk_rmem_alloc));
 	WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc));
 	WARN_ON_ONCE(sk->sk_wmem_queued);
-	WARN_ON_ONCE(sk_forward_alloc_get(sk));
+	WARN_ON_ONCE(sk->sk_forward_alloc);
=20
 	kfree(rcu_dereference_protected(inet->inet_opt, 1));
 	dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1));
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 321acc8abf17e8c7d6a4e3326615123fff19deab..efe2a085cf68e90cd1e79b5556e=
667a0fd044bfd 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -282,7 +282,7 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_conn=
ection_sock *icsk,
 		struct inet_diag_meminfo minfo =3D {
 			.idiag_rmem =3D sk_rmem_alloc_get(sk),
 			.idiag_wmem =3D READ_ONCE(sk->sk_wmem_queued),
-			.idiag_fmem =3D sk_forward_alloc_get(sk),
+			.idiag_fmem =3D READ_ONCE(sk->sk_forward_alloc),
 			.idiag_tmem =3D sk_wmem_alloc_get(sk),
 		};
=20
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index 8996c73c9779b5fa804e6f913834cf1fe4d071e6..3f2e707a11d18922d7d9dd93e83=
15c1ab26eebc7 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -460,7 +460,7 @@ META_COLLECTOR(int_sk_fwd_alloc)
 		*err =3D -1;
 		return;
 	}
-	dst->value =3D sk_forward_alloc_get(sk);
+	dst->value =3D READ_ONCE(sk->sk_forward_alloc);
 }
=20
 META_COLLECTOR(int_sk_sndbuf)

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 963ED270EC3;
	Tue, 18 Feb 2025 18:36:49 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903809; cv=none;
 b=JehsqM5h1oUSZCq3V81DbeevHfbUGw2H9r/SHT2Bi8oK7hKwI9jJZZ6dfKsHQW3ExoXdWfd4xhF3/nFTNJ3FaHguD3zIRnkCXDHMDUj0pxVZOc6YzSwOPUdcTzL0x86rLgf3S8o6W8yj1EwqvwzM8Erd8w17aa1xgPyPcVCPeuc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903809; c=relaxed/simple;
	bh=JHX0AHDU9Z3wO5E/PzOjf/tEoouvJ5ogW4zW/Wl9gQQ=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=YtbN/HmYpH5dIEEzxWO5HKKLPjLm6mnQBGCz3WuvnsGsNBGlFPN05CbHetpLBeNPK12PZaL+MwuepbzaXTyzLHbWJmBSJzfnSsp4TuoeP4LqXOJUOkUW399yV6dQ8GcBDWKYNorfKljsam1YelacN3mhA+JMmRBa9xRd5K7plqA=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=d0EnWWX7; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="d0EnWWX7"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB96AC4CEEB;
	Tue, 18 Feb 2025 18:36:45 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903809;
	bh=JHX0AHDU9Z3wO5E/PzOjf/tEoouvJ5ogW4zW/Wl9gQQ=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=d0EnWWX7njrWBA1vyWmv3LhqNoqBkpquxupxyPYu3V5LV7mO+p+r2kXXwRRbxzNrK
	 oRueJZ5JGZfYM3sRciHZCcvaDXztP48oST8amvk/H4S+yhT59dK0qfb59fvAoVAiHC
	 EjBSAzzECG0h8LeYH7e1TNEQIw574QZqHylT60VAkHXEXIcpHqDPz5xra/6FbKjqIN
	 qMRlDpM6aHXtZDK0akxi05KPG2fKGzS0PGsZ/Kez207cwzop0jEFxjZYPmJlhlHzIb
	 06iqlcVRGIy8tA8Q7AWnyv8toMtZ0K1+k5qNemFiErwZ3KFI87rdNserehdAeekQZh
	 s/JV7LQq4fxxg==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:17 +0100
Subject: [PATCH net-next 6/7] mptcp: dismiss __mptcp_rmem()
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id: <mptcp.lists.linux.dev>
List-Subscribe: <mailto:mptcp+subscribe@lists.linux.dev>
List-Unsubscribe: <mailto:mptcp+unsubscribe@lists.linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-6-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=3152; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=eXI8QUSL7/rXzK6E2jx6P6T/20wyQGDw98FRoytZKY8=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMnBYnWH2koruYhyBP2MYhxM5254T/PsjBAo
 CYYDUvWX++JAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 c9VqEACMNiqYIb7EJmojhr3S0niZ+WGvzsXVYeE2zjbLAkS4hWaTkr9RL/HxRXXRz1xGXA3CuRj
 j2SDDsM9H8fgm9JCZbiu2GUWM7VKQEvC399NjvMQoNJZSAjNrOH0WuHl/F045z8cjKdJJ3gPKS/
 wGiBSsdivbSslbwU0ihBNDx1IiaUNGpBqRo6kX92gtV4ScDOZbPrhQRq3bgBhTimj2uuM3PlPPk
 B6fMpHT/Wco66ayBOAV51VUAbopML4OWdTrR8xP0Pa2ro3ak+cqXX+m3M8OJYrN22V0ZoZ6iwKL
 8NjBmkbo09W7KHHXqoOGnjp6uyFXoGaHZf6bSvFkvSAE0VcOhsxF8cdi1/rY/vguFsuVh0Fikiz
 KoKSJLbRxS3LH7wdLZmKdL5Ha7tHIks1NIsPVWYIVopFrilUPlvgT2VgZ/hVzYB/kHclu5pCQ4B
 F8P4gCT9fBNB7qu+1qwcd3dAGA8oV1+nHXjbi9DvvTto1WF+1MBz/ew1hMPWcXVwcW547c4yuKd
 65Y5BpQxi2DvYxrI6Newh3aMmwNQ51TiFJ+kWfaZsIooEr4/tjbWi53Rjz6iRAoZb27kS+irFSP
 HU8Y/N2rXmc0mF6UAMNqTMGbW1RXRsUtO3OclwhZICrI0Tm8jUIBN13K/TMv5cDmjSm690/Fo5i
 L8pn2rpcjcHhcLw==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

After the RX path refactor, __mptcp_rmem() becomes a mere wrapper for
sk_rmem_alloc access, with a slightly misleading name. Just drop it.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.c |  8 ++++----
 net/mptcp/protocol.h | 11 ++---------
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 080877f8daf7e3ff36531f3e11079d2163676f2d..c709f654cd5a4944390cf1e160f=
59cd3b509b66d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -496,7 +496,7 @@ static void mptcp_cleanup_rbuf(struct mptcp_sock *msk, =
int copied)
 	bool cleanup, rx_empty;
=20
 	cleanup =3D (space > 0) && (space >=3D (old_space << 1)) && copied;
-	rx_empty =3D !__mptcp_rmem(sk) && copied;
+	rx_empty =3D !sk_rmem_alloc_get(sk) && copied;
=20
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk =3D mptcp_subflow_tcp_sock(subflow);
@@ -645,7 +645,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp=
_sock *msk,
 		WRITE_ONCE(tp->copied_seq, seq);
 		more_data_avail =3D mptcp_subflow_data_available(ssk);
=20
-		if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf) {
+		if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf) {
 			done =3D true;
 			break;
 		}
@@ -782,7 +782,7 @@ static void __mptcp_data_ready(struct sock *sk, struct =
sock *ssk)
 	__mptcp_rcvbuf_update(sk, ssk);
=20
 	/* over limit? can't append more skbs to msk, Also, no need to wake-up*/
-	if (__mptcp_rmem(sk) > sk->sk_rcvbuf)
+	if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
 		return;
=20
 	/* Wake-up the reader only for in-sequence data */
@@ -2049,7 +2049,7 @@ static bool __mptcp_move_skbs(struct sock *sk)
 		mptcp_for_each_subflow(msk, subflow)
 			__mptcp_rcvbuf_update(sk, subflow->tcp_sock);
=20
-	if (__mptcp_rmem(sk) > sk->sk_rcvbuf)
+	if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
 		return false;
=20
 	do {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 613d556ed938a99a2800b4384ee4c6cda9483381..a1a077bae7b6ec4fab5b266e261=
3acb145eb343f 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -380,14 +380,6 @@ static inline void msk_owned_by_me(const struct mptcp_=
sock *msk)
 #define mptcp_sk(ptr) container_of_const(ptr, struct mptcp_sock, sk.icsk_i=
net.sk)
 #endif
=20
-/* the msk socket don't use the backlog, also account for the bulk
- * free memory
- */
-static inline int __mptcp_rmem(const struct sock *sk)
-{
-	return atomic_read(&sk->sk_rmem_alloc);
-}
-
 static inline int mptcp_win_from_space(const struct sock *sk, int space)
 {
 	return __tcp_win_from_space(mptcp_sk(sk)->scaling_ratio, space);
@@ -400,7 +392,8 @@ static inline int mptcp_space_from_win(const struct soc=
k *sk, int win)
=20
 static inline int __mptcp_space(const struct sock *sk)
 {
-	return mptcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf) - __mptcp_rmem(s=
k));
+	return mptcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf) -
+				    sk_rmem_alloc_get(sk));
 }
=20
 static inline struct mptcp_data_frag *mptcp_send_head(const struct sock *s=
k)

--=20
2.47.1
From nobody Wed Apr 30 11:31:34 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org
 [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 369511EB5DE;
	Tue, 18 Feb 2025 18:36:52 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739903814; cv=none;
 b=HSu9CyzRH4ljXG7bdZbYybNH4IP5fctUnZh62h2PLpTsHBg/Vos3B5NTptu+6PP6DRue/zIdgLXM+i39xGAWhgff4ODETzYzKIgX9nfF/ai0pvmB9yslIW3UVs+IvP+X40LL47GE2lQxlSig3iLF3jZUBJmgG8MTLPVynG7BU7k=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739903814; c=relaxed/simple;
	bh=FuEmwmBco4pfgcrOyMQPaOEQ5AMu1vAUJfMUV8FkDtc=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
 b=nqClFfjVUZUCIiV3YSbUVshEc+snQau+VjGAF703QqmU5ueM/ZoX6Ymhkjv2RC5dKoaBEYkvsA96G/rer0+NOI7zTcxnQGiYgNGs8Ck0lLDZJyPkLJTRhCe7uMzWYFuuQuRLeI0mpTmZgHPaU1ZTK0f4r5mKoWZSv9ke8s1oHb8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=Ll7DrvDc; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="Ll7DrvDc"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7FF04C4CEE2;
	Tue, 18 Feb 2025 18:36:49 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1739903812;
	bh=FuEmwmBco4pfgcrOyMQPaOEQ5AMu1vAUJfMUV8FkDtc=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=Ll7DrvDc7wZJh0iSzzPZ8RkckswbLFwwQrdHbgPkcdKEBaXI+BZ64U6H7w0Krm/F1
	 80mM843qgQd1/37WYnhH+lL1uRTjVMwFKEVTDnDKxMJ4M3uPeBlqx3X1qU2wWUdXLk
	 o4nF1zqYq6oxcT0szQUc0rmIVoH5YQmQI+GfizpvPA+t1BBF1/4Cx4fv3hCGTZQ5Zp
	 Xa6IwUSZBtYbGozng/IpqcFNZkz4OM/uFRZIw9hsgXUrqGDLV2VBNIjDfLxEE6oUKz
	 29NcBhjv58HKPjOq7gmtVc4eq2TwWqRwKT4FkfBonKsfL/OtUQ2FZuQnWG75AZEXbt
	 07+8qN55+G14Q==
From: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
Date: Tue, 18 Feb 2025 19:36:18 +0100
Subject: [PATCH net-next 7/7] mptcp: micro-optimize __mptcp_move_skb()
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-7-4a47d90d7998@kernel.org>
References: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
In-Reply-To: 
 <20250218-net-next-mptcp-rx-path-refactor-v1-0-4a47d90d7998@kernel.org>
To: mptcp@lists.linux.dev, Mat Martineau <martineau@kernel.org>,
 Geliang Tang <geliang@kernel.org>, "David S. Miller" <davem@davemloft.net>,
 Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>,
 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
 Willem de Bruijn <willemb@google.com>, David Ahern <dsahern@kernel.org>,
 Jamal Hadi Salim <jhs@mojatatu.com>, Cong Wang <xiyou.wangcong@gmail.com>,
 Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@kernel.org>
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=openpgp-sha256; l=7810; i=matttbe@kernel.org;
 h=from:subject:message-id; bh=UUZ2l4CArzUWVM9zGO3re19UeTioW3f1Q/TZ8tu7/is=;
 b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBntNMn/JObq6JQyNbP7lEygvD9z7ysS+k4zTqCV
 sJ4zhYrwAaJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZ7TTJwAKCRD2t4JPQmmg
 cxayEACwPcuobyxN+pAtEAADdCZLdNZpT7IZ+EI/nbBxCQfPNFKpdUgTGWvL+C7WZDFcNP1NSzQ
 zDDrGo2Z1yQz1eVf26tE+tg1LeLe6eEPWzfQqWXP9NKHBQjGk8e/HAI7p2SlB9RuiJpktlgbTTW
 wYdymPLYZLPK0QXPgOmshMk2XWYn9aNl4V1hZTspRsj4DJxmTbRDqSIJR3dvJJ+yrPj3QF893a+
 lU8k6gLZAFVVqb6lEhHNyBwNs3HIckqiaJv3d+HJx76OoCAzH43X95gxWsg2yFB+F7d6Gw6wsml
 OhqxxCSMmB6MS5/QUN5YzJfoNwsCPMipn4esD4yueJf7P5ln8O4UEJpOt5p60UVXcMfNpc/+22q
 Xu9tvzQDlLucEc6TDl7TIXNh8JQiCFnFcVWbLDkdHrgNQgSY6XUT6KCb+iJhUtBO/mCYTwBdtS2
 DDEjhCsavdE4e1D+cBfY0ly/KXBOKV/K+8Ep8hIked6qfdf8s7USRbbULgDegmJ+mRC76xKbDms
 Oh/K+vYc4cPiXylehd4l0410OSA16NJFZkVqaX6X6BlqMqcdsxmmHwC0xNJG6a12Gf4cmgCWHeK
 4tJoc8Yv464IbhQacUABmcBycEkYjBS0gIDJIL46ZNZFk1PDUDLy54oUGfn775WrQGma3V6CwlQ
 rUR+8iHmSslTmtg==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp;
 fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Paolo Abeni <pabeni@redhat.com>

After the RX path refactor, the mentioned function is expected to run
frequently, so let's optimize it a bit.

Scan for a ready subflow starting from the last processed one, and stop
after traversing the list once or reaching the msk memory limit, instead
of looking for dubious per-subflow conditions.
Also re-order the memory limit checks to avoid duplicate tests.
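
For illustration only, the scan described above can be modelled in
userspace roughly as below. This is a simplified sketch, not the kernel
code: the toy_* types and helpers are hypothetical stand-ins, an array
index replaces the circular conn_list walk, and a plain byte counter
replaces sk_rmem_alloc/sk_rcvbuf. No locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a subflow: only a count of pending bytes. */
struct toy_subflow {
	int pending;
};

/* Hypothetical stand-in for the msk: a memory budget plus its subflows. */
struct toy_msk {
	int rmem;			/* bytes currently queued */
	int rcvbuf;			/* memory limit */
	size_t nr;			/* number of subflows */
	struct toy_subflow *subflows;
};

/* Return the index of the first subflow with pending data, scanning
 * circularly from 'start'; return -1 after one full traversal.
 */
static int toy_first_ready_from(const struct toy_msk *msk, size_t start)
{
	size_t i = start;

	assert(msk->nr > 0 && start < msk->nr);
	while (msk->subflows[i].pending == 0) {
		i = (i + 1) % msk->nr;
		if (i == start)
			return -1;
	}
	return (int)i;
}

/* Move data in 'chunk'-sized pieces; the memory limit is tested once,
 * at the top of each iteration. Returns 1 if anything was moved.
 */
static int toy_move_from_subflow(struct toy_msk *msk,
				 struct toy_subflow *sf, int chunk)
{
	int moved = 0;

	assert(chunk > 0);
	while (sf->pending > 0) {
		int len;

		if (msk->rmem > msk->rcvbuf)
			break;

		len = sf->pending < chunk ? sf->pending : chunk;
		sf->pending -= len;
		msk->rmem += len;
		moved = 1;
	}
	return moved;
}

/* Outer loop: stop on the memory limit or when no subflow is ready,
 * and resume the scan right after the subflow just processed.
 */
static int toy_move_all(struct toy_msk *msk, int chunk)
{
	size_t cur = 0;
	int ret = 0;

	if (msk->nr == 0)
		return 0;

	for (;;) {
		int idx;

		if (msk->rmem > msk->rcvbuf)
			break;

		idx = toy_first_ready_from(msk, cur);
		if (idx < 0)
			break;

		ret = toy_move_from_subflow(msk, &msk->subflows[idx], chunk) || ret;
		cur = ((size_t)idx + 1) % msk->nr;
	}
	return ret;
}
```

In this model the loop always terminates: each pass either hits the
limit, finds no ready subflow, or moves at least one chunk.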

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.c | 111 +++++++++++++++++++++++------------------------=
----
 net/mptcp/protocol.h |   2 +
 2 files changed, 52 insertions(+), 61 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index c709f654cd5a4944390cf1e160f59cd3b509b66d..6b61b7dee33be10294ae1101f92=
06144878a3192 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -569,15 +569,13 @@ static void mptcp_dss_corruption(struct mptcp_sock *m=
sk, struct sock *ssk)
 }
=20
 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
-					   struct sock *ssk,
-					   unsigned int *bytes)
+					   struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk);
 	struct sock *sk =3D (struct sock *)msk;
-	unsigned int moved =3D 0;
 	bool more_data_avail;
 	struct tcp_sock *tp;
-	bool done =3D false;
+	bool ret =3D false;
=20
 	pr_debug("msk=3D%p ssk=3D%p\n", msk, ssk);
 	tp =3D tcp_sk(ssk);
@@ -587,20 +585,16 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt=
cp_sock *msk,
 		struct sk_buff *skb;
 		bool fin;
=20
+		if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
+			break;
+
 		/* try to move as much data as available */
 		map_remaining =3D subflow->map_data_len -
 				mptcp_subflow_get_map_offset(subflow);
=20
 		skb =3D skb_peek(&ssk->sk_receive_queue);
-		if (!skb) {
-			/* With racing move_skbs_to_msk() and __mptcp_move_skbs(),
-			 * a different CPU can have already processed the pending
-			 * data, stop here or we can enter an infinite loop
-			 */
-			if (!moved)
-				done =3D true;
+		if (unlikely(!skb))
 			break;
-		}
=20
 		if (__mptcp_check_fallback(msk)) {
 			/* Under fallback skbs have no MPTCP extension and TCP could
@@ -613,19 +607,13 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt=
cp_sock *msk,
=20
 		offset =3D seq - TCP_SKB_CB(skb)->seq;
 		fin =3D TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN;
-		if (fin) {
-			done =3D true;
+		if (fin)
 			seq++;
-		}
=20
 		if (offset < skb->len) {
 			size_t len =3D skb->len - offset;
=20
-			if (tp->urg_data)
-				done =3D true;
-
-			if (__mptcp_move_skb(msk, ssk, skb, offset, len))
-				moved +=3D len;
+			ret =3D __mptcp_move_skb(msk, ssk, skb, offset, len) || ret;
 			seq +=3D len;
=20
 			if (unlikely(map_remaining < len)) {
@@ -639,22 +627,16 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt=
cp_sock *msk,
 			}
=20
 			sk_eat_skb(ssk, skb);
-			done =3D true;
 		}
=20
 		WRITE_ONCE(tp->copied_seq, seq);
 		more_data_avail =3D mptcp_subflow_data_available(ssk);
=20
-		if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf) {
-			done =3D true;
-			break;
-		}
 	} while (more_data_avail);
=20
-	if (moved > 0)
+	if (ret)
 		msk->last_data_recv =3D tcp_jiffies32;
-	*bytes +=3D moved;
-	return done;
+	return ret;
 }
=20
 static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
@@ -748,9 +730,9 @@ void __mptcp_error_report(struct sock *sk)
 static bool move_skbs_to_msk(struct mptcp_sock *msk, struct sock *ssk)
 {
 	struct sock *sk =3D (struct sock *)msk;
-	unsigned int moved =3D 0;
+	bool moved;
=20
-	__mptcp_move_skbs_from_subflow(msk, ssk, &moved);
+	moved =3D __mptcp_move_skbs_from_subflow(msk, ssk);
 	__mptcp_ofo_queue(msk);
 	if (unlikely(ssk->sk_err)) {
 		if (!sock_owned_by_user(sk))
@@ -766,7 +748,7 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, st=
ruct sock *ssk)
 	 */
 	if (mptcp_pending_data_fin(sk, NULL))
 		mptcp_schedule_work(sk);
-	return moved > 0;
+	return moved;
 }
=20
 static void __mptcp_rcvbuf_update(struct sock *sk, struct sock *ssk)
@@ -781,10 +763,6 @@ static void __mptcp_data_ready(struct sock *sk, struct=
 sock *ssk)
=20
 	__mptcp_rcvbuf_update(sk, ssk);
=20
-	/* over limit? can't append more skbs to msk, Also, no need to wake-up*/
-	if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
-		return;
-
 	/* Wake-up the reader only for in-sequence data */
 	if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk))
 		sk->sk_data_ready(sk);
@@ -884,20 +862,6 @@ bool mptcp_schedule_work(struct sock *sk)
 	return false;
 }
=20
-static struct sock *mptcp_subflow_recv_lookup(const struct mptcp_sock *msk)
-{
-	struct mptcp_subflow_context *subflow;
-
-	msk_owned_by_me(msk);
-
-	mptcp_for_each_subflow(msk, subflow) {
-		if (READ_ONCE(subflow->data_avail))
-			return mptcp_subflow_tcp_sock(subflow);
-	}
-
-	return NULL;
-}
-
 static bool mptcp_skb_can_collapse_to(u64 write_seq,
 				      const struct sk_buff *skb,
 				      const struct mptcp_ext *mpext)
@@ -2037,37 +2001,62 @@ static void mptcp_rcv_space_adjust(struct mptcp_soc=
k *msk, int copied)
 	msk->rcvq_space.time =3D mstamp;
 }
=20
+static struct mptcp_subflow_context *
+__mptcp_first_ready_from(struct mptcp_sock *msk,
+			 struct mptcp_subflow_context *subflow)
+{
+	struct mptcp_subflow_context *start_subflow =3D subflow;
+
+	while (!READ_ONCE(subflow->data_avail)) {
+		subflow =3D mptcp_next_subflow(msk, subflow);
+		if (subflow =3D=3D start_subflow)
+			return NULL;
+	}
+	return subflow;
+}
+
 static bool __mptcp_move_skbs(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow;
 	struct mptcp_sock *msk =3D mptcp_sk(sk);
-	unsigned int moved =3D 0;
-	bool ret, done;
+	bool ret =3D false;
+
+	if (list_empty(&msk->conn_list))
+		return false;
=20
 	/* verify we can move any data from the subflow, eventually updating */
 	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
 		mptcp_for_each_subflow(msk, subflow)
 			__mptcp_rcvbuf_update(sk, subflow->tcp_sock);
=20
-	if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
-		return false;
-
-	do {
-		struct sock *ssk =3D mptcp_subflow_recv_lookup(msk);
+	subflow =3D list_first_entry(&msk->conn_list,
+				   struct mptcp_subflow_context, node);
+	for (;;) {
+		struct sock *ssk;
 		bool slowpath;
=20
-		if (unlikely(!ssk))
+		/*
+		 * As an optimization, avoid traversing the subflows list and
+		 * possibly acquiring the subflow socket lock before bailing out
+		 */
+		if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf)
 			break;
=20
-		slowpath =3D lock_sock_fast(ssk);
-		done =3D __mptcp_move_skbs_from_subflow(msk, ssk, &moved);
+		subflow =3D __mptcp_first_ready_from(msk, subflow);
+		if (!subflow)
+			break;
=20
+		ssk =3D mptcp_subflow_tcp_sock(subflow);
+		slowpath =3D lock_sock_fast(ssk);
+		ret =3D __mptcp_move_skbs_from_subflow(msk, ssk) || ret;
 		if (unlikely(ssk->sk_err))
 			__mptcp_error_report(sk);
 		unlock_sock_fast(ssk, slowpath);
-	} while (!done);
=20
-	ret =3D moved > 0 || __mptcp_ofo_queue(msk);
+		subflow =3D mptcp_next_subflow(msk, subflow);
+	}
+
+	__mptcp_ofo_queue(msk);
 	if (ret)
 		mptcp_check_data_fin((struct sock *)msk);
 	return ret;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a1a077bae7b6ec4fab5b266e2613acb145eb343f..ca65f8bff632ff806fe761f86e9=
aa065b0657d1e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -354,6 +354,8 @@ struct mptcp_sock {
 	list_for_each_entry(__subflow, &((__msk)->conn_list), node)
 #define mptcp_for_each_subflow_safe(__msk, __subflow, __tmp)			\
 	list_for_each_entry_safe(__subflow, __tmp, &((__msk)->conn_list), node)
+#define mptcp_next_subflow(__msk, __subflow)				\
+	list_next_entry_circular(__subflow, &((__msk)->conn_list), node)
=20
 extern struct genl_family mptcp_genl_family;
=20

--=20
2.47.1