From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0DBEF238171 for ; Fri, 19 Sep 2025 15:53:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297223; cv=none; b=ar2H4IAyB5WkEgrp/f3oNYrhTS7hmrOZOgqgMXZYin16brb7Rp+oKaPvqtzncFdeaCs677JumUxxEivj8rHnYSuOTK5E6d9hqEPt9YmrWnMbQbJdWa9ckC5LzYxt7I3CdDUJcndLga8ntB2VXrHoHjriLRnEZBRPzIRW1iWbYCg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297223; c=relaxed/simple; bh=qdda2Wdi9fTdcrtN0KrGI1q1zV2gpAPEK1O0CItX9as=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=ober6EmvnC5oIGNdLkh9EYiQQ/B0hxCbxtcwBWsyxSM30beHAmqUbLV/B+Dmg8q5JuivjgSrRwLHhU4iWNEbgg0X5Fhny7mZyIA6B12jVUgVrd6iG0LjKO6t7NvJgEDz7SF4p82pPiCNZ0SSwCo1WdPHjBXrCS02N29rJih0p4I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Pfr3VsIT; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Pfr3VsIT" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297220; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OObWxAKQ1wd0hNQYGk8h+U9TuSaEjC81KerF0JiIV8I=; b=Pfr3VsITG6eLJYZ3Sm/XWt7318/BiXg6mAzQl8gn2tEYHZueejYy7FFxiIzuJpu58kVLA/ mGo3K5ZNAXqJvJ6kPnf5KwI8xRYIRyxrgeJgyDJQl3PXSyz8AphdyuIIdcd7WVNddrDza2 Q26m0qW47uILE32kJTob+XgX68tl8nM= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-629-cqOt-_CqOvmrD9nadNiLwg-1; Fri, 19 Sep 2025 11:53:39 -0400 X-MC-Unique: cqOt-_CqOvmrD9nadNiLwg-1 X-Mimecast-MFC-AGG-ID: cqOt-_CqOvmrD9nadNiLwg_1758297218 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id D75941800366 for ; Fri, 19 Sep 2025 15:53:38 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id EB18E1800452 for ; Fri, 19 Sep 2025 15:53:37 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 01/12] mptcp: leverage skb deferral free Date: Fri, 19 Sep 2025 17:53:15 +0200 Message-ID: <57290c1e6cdd4a157f2bbb64a77d3afe6705c9e3.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: PUYLoB3wCuhQPNnkoIW_9FcAa7LjQZ-7aPxt8DPQFu8_1758297218 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: 
text/plain; charset="utf-8"; x-default="true"

Usage of the skb deferral API is straightforward; with multiple active
subflows this allows moving part of the received application load onto
multiple CPUs.

Also fix a typo in the related comment.

Reviewed-by: Geliang Tang
Tested-by: Geliang Tang
Reviewed-by: Matthieu Baerts (NGI0)
Signed-off-by: Paolo Abeni
---
 net/mptcp/protocol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7933291e991ce..9d95d24781509 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1943,12 +1943,13 @@ static int __mptcp_recvmsg_mskq(struct sock *sk,
 		}
 
 		if (!(flags & MSG_PEEK)) {
-			/* avoid the indirect call, we know the destructor is sock_wfree */
+			/* avoid the indirect call, we know the destructor is sock_rfree */
 			skb->destructor = NULL;
+			skb->sk = NULL;
 			atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
 			sk_mem_uncharge(sk, skb->truesize);
 			__skb_unlink(skb, &sk->sk_receive_queue);
-			__kfree_skb(skb);
+			skb_attempt_defer_free(skb);
 			msk->bytes_consumed += count;
 		}
 
-- 
2.51.0

From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F1792505AF for ; Fri, 19 Sep 2025 15:53:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297225; cv=none; b=kc3OMmyJqE5NAejUDzl+AD6WFxshC3ws99CLsdzDu9Pxqkeae2uIqBa+eycY/urKhlH1CfdUA1ucq8Gx/6p1p8T34FURtvA9gTqW8G4Eg+16VYpOWL+VTZhExWoWcpRTpWE34gF5Ft7cDHpBq5FiXc5jyJt1lY9UsO49O5kwhJk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297225; c=relaxed/simple; bh=GW5qJEs97fTHG9Vu/ULSl9rFKuTuAR3QGb0ChaZXjdQ=; 
h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=Ws3QewWka++nDpaRnoGfNCKeDsrTDkADtCA7OKmreRL9Xo1D5qBCDIZyo9OSkIiPQ60x7MNBSEw3Ly5z4LZMre0MUq09u7Ip3JQg55gxqbN+soJ1AvOBWGQ2rRMyGllJjcTMQJye90TJtr83WB6Vb4u6AjE0EiNEvDDBtSZR58w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=VbbGKsBN; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VbbGKsBN" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297222; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x3etxw24/n6GV8m/GDWSYf7hWuOKJrNtAg0hggNiRTc=; b=VbbGKsBN4DhGeNWp1s8T+uegawXw2fyct6Pkb/FlRfeOwecd21uJ4BcYeveK9w8vtbUGuP PH8s3zwbgW4ELo6X3wMCKanQiW2K6F+12DyGgerua4ut9SSr/3P81fpY1scxAbNMtHJY00 X7x2K4SuSaA9OjLNtFwmdLa/eJFB85o= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-516-kHJPSaJ2NWu_6wHuYfrOfA-1; Fri, 19 Sep 2025 11:53:41 -0400 X-MC-Unique: kHJPSaJ2NWu_6wHuYfrOfA-1 X-Mimecast-MFC-AGG-ID: kHJPSaJ2NWu_6wHuYfrOfA_1758297220 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 47DC81800365 for ; Fri, 19 Sep 2025 15:53:40 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 69502180035E for ; Fri, 19 Sep 2025 15:53:39 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 02/12] tcp: make tcp_rcvbuf_grow() accessible to mptcp code Date: Fri, 19 Sep 2025 17:53:16 +0200 Message-ID: <657458419130e9dc4ccfdfb7e89b6d8e86a9cd96.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: PV8wmiiVvdI6UBh8-RPBwr6OL0kCmHnKnIsoHzbHPzA_1758297220 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true"

To leverage the auto-tuning improvements brought by commit 2da35e4b4df9
("Merge branch 'tcp-receive-side-improvements'"), the MPTCP stack needs
to access the mentioned helper.

Reviewed-by: Geliang Tang
Tested-by: Geliang Tang
Acked-by: Matthieu Baerts (NGI0)
Signed-off-by: Paolo Abeni
---
 include/net/tcp.h    | 1 +
 net/ipv4/tcp_input.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index e25340459ce4a..db2a4e05147fa 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -356,6 +356,7 @@ void tcp_delack_timer_handler(struct sock *sk);
 int tcp_ioctl(struct sock *sk, int cmd, int *karg);
 enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb);
 void tcp_rcv_established(struct sock *sk, struct sk_buff *skb);
+void tcp_rcvbuf_grow(struct sock *sk);
 void tcp_rcv_space_adjust(struct sock *sk);
 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp);
 void tcp_twsk_destructor(struct sock *sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b2793e749cfd9..ad09995a1aaec 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -701,7 +701,7 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
 	}
 }
 
-static void tcp_rcvbuf_grow(struct sock *sk)
+void tcp_rcvbuf_grow(struct sock *sk)
 {
 	const struct net *net = sock_net(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
-- 
2.51.0

From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E1FF72C21F8 for ; Fri, 19 Sep 2025 15:53:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297226; cv=none; b=AFWac+uQGopJFaqDNAF4S5OyIbjt4ySADG/DXe93tyfOb0Nt6LficwdKAmiSnCUNj2EO9mXCapTPaNVf+r5i/ADhNJdsMkweFZ2gQKSWsIqRw7ZZDTi6wz3X+77cXztlhcrBMbk0MDNQ1x4MmRGJrFsWM10C5IxTcGbYGGpeCP8= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1758297226; c=relaxed/simple; bh=ubCKXzaFwjqSZEdr2xKc5qEJhiq1yqyo3JSi6A9Va7M=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=Q8L4lVyI8G9QsY3MX/dziAd1YIN2ROZ6mi+8fvkASrTNDYGMygxekbGaiIKjyTd6P7NHUdBNutBhhniela6jVTSX4xuvTuKcpdP6AXvFlP79haqz8Mnr13eTMn9qDMfUXAaff4TVCPizyeMuQXjBo+uq7MFsfdodOnl261T3UEQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=dkD5o8zK; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="dkD5o8zK" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297224; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2lOEbne5y/2o7H2yMVxjqtL9Q4iFhY3IW+KbQaxCbN4=; b=dkD5o8zKupI8oSmMeWE0YIoCRHYIkvM0yc1taTE0cF6+VMSr+JuM88iUuEoy/l9LjZMGq1 Jo5KabH/6ImhTTxW6336u5Y0rwAUmwmDqQLjC72TSKzMbtFW3hP8DqTYFGSIZzbbMHRFWl /yQTXWfmUCSdFnA1zLw+XrDL/HMw8xg= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-625-xLmESz_kOOKarafJG89_Cw-1; Fri, 19 Sep 2025 11:53:42 -0400 X-MC-Unique: xLmESz_kOOKarafJG89_Cw-1 X-Mimecast-MFC-AGG-ID: xLmESz_kOOKarafJG89_Cw_1758297221 Received: from 
mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id BDBCC180028C for ; Fri, 19 Sep 2025 15:53:41 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DB6DE1800452 for ; Fri, 19 Sep 2025 15:53:40 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 03/12] mptcp: rcvbuf auto-tuning improvement Date: Fri, 19 Sep 2025 17:53:17 +0200 Message-ID: <3b9f144fd4df4c02550946d4208a0ede8e776526.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: aMxpGy4VrT6CaRVaoJ61b9XlbMOOtJuA46L0p_qaszQ_1758297221 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true"

Apply to the MPTCP auto-tuning the same improvements introduced for the
TCP protocol by the merge commit 2da35e4b4df9 ("Merge branch
'tcp-receive-side-improvements'").

The main difference is that the TCP subflows and the main MPTCP socket
need to account separately for OoO: MPTCP does not care about TCP-level
OoO and vice versa. As a consequence, do not reflect MPTCP-level rcvbuf
increases due to OoO packets at the subflow level.

This refactor additionally allows dropping the msk receive buffer update
at receive time, as the latter was only intended to cope with subflow
receive buffer increases due to OoO packets.

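The sizing rule described above can be sketched in userspace as follows. This is only an illustrative model under stated assumptions, not the kernel code: struct model_msk, model_space_from_win() and model_rcvbuf_grow() are hypothetical names, the fixed 2x window-to-buffer factor is an assumption (the kernel derives the ratio per socket), and 64-bit arithmetic is used here instead of the int/u32 types in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: these names do not exist in the kernel. */
struct model_msk {
	uint32_t space;    /* bytes copied in the last measurement window */
	uint32_t ooo_span; /* bytes from ack_seq to the end of the last OoO skb, 0 if none */
	uint32_t rcvbuf;   /* current receive buffer size, in bytes */
};

/* Assumption: half of the buffer memory carries payload, so the buffer
 * needed for a given window is twice the window.
 */
static uint64_t model_space_from_win(uint64_t win)
{
	return win * 2;
}

/* Returns true only when the receive buffer actually grows. */
static bool model_rcvbuf_grow(struct model_msk *m, bool moderate_rcvbuf,
			      bool rcvbuf_locked, uint32_t cap)
{
	uint64_t rcvwin, rcvbuf;

	assert(m);

	/* auto-tuning disabled, or buffer size pinned by the user */
	if (!moderate_rcvbuf || rcvbuf_locked)
		return false;

	/* twice the last measured window, plus the OoO span if any */
	rcvwin = ((uint64_t)m->space << 1) + m->ooo_span;

	/* convert to buffer space and clamp to the configured maximum */
	rcvbuf = model_space_from_win(rcvwin);
	if (rcvbuf > cap)
		rcvbuf = cap;

	if (rcvbuf > m->rcvbuf) {
		m->rcvbuf = (uint32_t)rcvbuf;
		return true;
	}
	return false;
}
```

In this model, a socket with space = 10000, no OoO data and rcvbuf = 16384 grows to 40000 on the first call and is left unchanged by a second identical call. The boolean return value mirrors the patch: the caller only propagates the change to the subflows when the buffer was actually updated.
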
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/487
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/559
Reviewed-by: Geliang Tang
Tested-by: Geliang Tang
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts (NGI0)
---
v2 -> v3:
 - copy tcp_rcvbuf_grow() implementation, verbatim
 - intentionally omitted Mat's tag due to the many changes
v1 -> v2:
 - fix unused variable
 - reword the commit message
---
 net/mptcp/protocol.c | 97 +++++++++++++++++++++-----------------------
 net/mptcp/protocol.h |  4 +-
 2 files changed, 49 insertions(+), 52 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 9d95d24781509..c9fcdbaf50874 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -179,6 +179,35 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *msk, struct sk_buff *to,
 	return mptcp_try_coalesce((struct sock *)msk, to, from);
 }
 
+/* "inspired" by tcp_rcvbuf_grow(), main difference:
+ * - mptcp does not maintain a msk-level window clamp
+ * - returns true when the receive buffer is actually updated
+ */
+static bool mptcp_rcvbuf_grow(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	const struct net *net = sock_net(sk);
+	int rcvwin, rcvbuf, cap;
+
+	if (!READ_ONCE(net->ipv4.sysctl_tcp_moderate_rcvbuf) ||
+	    (sk->sk_userlocks & SOCK_RCVBUF_LOCK))
+		return false;
+
+	rcvwin = msk->rcvq_space.space << 1;
+
+	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
+		rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
+
+	cap = READ_ONCE(net->ipv4.sysctl_tcp_rmem[2]);
+
+	rcvbuf = min_t(u32, mptcp_space_from_win(sk, rcvwin), cap);
+	if (rcvbuf > sk->sk_rcvbuf) {
+		WRITE_ONCE(sk->sk_rcvbuf, rcvbuf);
+		return true;
+	}
+	return false;
+}
+
 /* "inspired" by tcp_data_queue_ofo(), main differences:
  * - use mptcp seqs
  * - don't cope with sacks
@@ -292,6 +321,9 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
 end:
 	skb_condense(skb);
 	skb_set_owner_r(skb, sk);
+	/* do not grow rcvbuf for not-yet-accepted or orphaned sockets. */
+	if (sk->sk_socket)
+		mptcp_rcvbuf_grow(sk);
 }
 
 static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
@@ -784,18 +816,10 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, struct sock *ssk)
 	return moved;
 }
 
-static void __mptcp_rcvbuf_update(struct sock *sk, struct sock *ssk)
-{
-	if (unlikely(ssk->sk_rcvbuf > sk->sk_rcvbuf))
-		WRITE_ONCE(sk->sk_rcvbuf, ssk->sk_rcvbuf);
-}
-
 static void __mptcp_data_ready(struct sock *sk, struct sock *ssk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	__mptcp_rcvbuf_update(sk, ssk);
-
 	/* Wake-up the reader only for in-sequence data */
 	if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk))
 		sk->sk_data_ready(sk);
@@ -2014,48 +2038,26 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
 	if (msk->rcvq_space.copied <= msk->rcvq_space.space)
 		goto new_measure;
 
-	if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf) &&
-	    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
-		u64 rcvwin, grow;
-		int rcvbuf;
-
-		rcvwin = ((u64)msk->rcvq_space.copied << 1) + 16 * advmss;
-
-		grow = rcvwin * (msk->rcvq_space.copied - msk->rcvq_space.space);
-
-		do_div(grow, msk->rcvq_space.space);
-		rcvwin += (grow << 1);
-
-		rcvbuf = min_t(u64, mptcp_space_from_win(sk, rcvwin),
-			       READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
-
-		if (rcvbuf > sk->sk_rcvbuf) {
-			u32 window_clamp;
-
-			window_clamp = mptcp_win_from_space(sk, rcvbuf);
-			WRITE_ONCE(sk->sk_rcvbuf, rcvbuf);
+	msk->rcvq_space.space = msk->rcvq_space.copied;
+	if (mptcp_rcvbuf_grow(sk)) {
 
-			/* Make subflows follow along. If we do not do this, we
-			 * get drops at subflow level if skbs can't be moved to
-			 * the mptcp rx queue fast enough (announced rcv_win can
-			 * exceed ssk->sk_rcvbuf).
-			 */
-			mptcp_for_each_subflow(msk, subflow) {
-				struct sock *ssk;
-				bool slow;
+		/* Make subflows follow along. If we do not do this, we
+		 * get drops at subflow level if skbs can't be moved to
+		 * the mptcp rx queue fast enough (announced rcv_win can
+		 * exceed ssk->sk_rcvbuf).
+		 */
+		mptcp_for_each_subflow(msk, subflow) {
+			struct sock *ssk;
+			bool slow;
 
-				ssk = mptcp_subflow_tcp_sock(subflow);
-				slow = lock_sock_fast(ssk);
-				WRITE_ONCE(ssk->sk_rcvbuf, rcvbuf);
-				WRITE_ONCE(tcp_sk(ssk)->window_clamp, window_clamp);
-				if (tcp_can_send_ack(ssk))
-					tcp_cleanup_rbuf(ssk, 1);
-				unlock_sock_fast(ssk, slow);
-			}
+			ssk = mptcp_subflow_tcp_sock(subflow);
+			slow = lock_sock_fast(ssk);
+			tcp_sk(ssk)->rcvq_space.space = msk->rcvq_space.copied;
+			tcp_rcvbuf_grow(ssk);
+			unlock_sock_fast(ssk, slow);
 		}
 	}
 
-	msk->rcvq_space.space = msk->rcvq_space.copied;
 new_measure:
 	msk->rcvq_space.copied = 0;
 	msk->rcvq_space.time = mstamp;
@@ -2084,11 +2086,6 @@ static bool __mptcp_move_skbs(struct sock *sk)
 	if (list_empty(&msk->conn_list))
 		return false;
 
-	/* verify we can move any data from the subflow, eventually updating */
-	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
-		mptcp_for_each_subflow(msk, subflow)
-			__mptcp_rcvbuf_update(sk, subflow->tcp_sock);
-
 	subflow = list_first_entry(&msk->conn_list,
 				   struct mptcp_subflow_context, node);
 	for (;;) {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 9b5a248bad404..6ac58e92a1aa3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -342,8 +342,8 @@ struct mptcp_sock {
 	struct mptcp_pm_data	pm;
 	struct mptcp_sched_ops	*sched;
 	struct {
-		u32	space;	/* bytes copied in last measurement window */
-		u32	copied; /* bytes copied in this measurement window */
+		int	space;	/* bytes copied in last measurement window */
+		int	copied; /* bytes copied in this measurement window */
 		u64	time;	/* start time of measurement window */
 		u64	rtt_us; /* last maximum rtt of subflows */
 	} rcvq_space;
-- 
2.51.0

From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com 
(us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9AAD231CA4E for ; Fri, 19 Sep 2025 15:53:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297233; cv=none; b=tMXAfKUGYfhuvur1ZaeMuQaRIBcUgXejJKUO0gjbKcVzYyUf0XBKeT5fTStnVU0l8AeqOWz+DoHd8WsVqnpyxti87HxrLJVeS/S+w8VXRlzgK0XauofcWulcQ91/BODLOqsB9DDHMhgPWoHjreZIqzXVRZIPEkg6Bnc4psaz3ps= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297233; c=relaxed/simple; bh=mbdu3YO/yLycgRg8w/z7nvqT3cY1YpD3nMHXFxZl35I=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=apE4hbSDlTh5uJ9wQ2IyDQD2KHFmYwYMDgb2hALqdefp/AIpC+d+AsJfIRyr8YUz7gV3uQl6v6KBswY3X/ZCdjv2DGJsTl7n/s9i2B2Zaqr/cQ7VyA9cxNQtBUjHFQPm9lnUChxt9riigaRGTyfIXJhAhfKwgjXfB8bZaiDYR/0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=EcUspT+0; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EcUspT+0" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297230; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=a0Y+027PCZr0jJZaLZPxCmWemuOJIGAGkRZ8KHVz8pY=; b=EcUspT+0m2e6nYIcjZ90kB3aXhnl+4hM1FpDEKDLEGg7l2PErtGhiwgEN7hGaNdAYu0CnB OWFAZOonVgQMOIn+IyMbdI0oUy4i3WW6AEMuK7A88CeuZam5WW1qQbLEFypnk5Nu61ajoE jT7gkNZI6KFXhXuNlPPQnlhJoIRBRJI= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-588-Q3pEm3W4N1a1WHikKJV4Aw-1; Fri, 19 Sep 2025 11:53:44 -0400 X-MC-Unique: Q3pEm3W4N1a1WHikKJV4Aw-1 X-Mimecast-MFC-AGG-ID: Q3pEm3W4N1a1WHikKJV4Aw_1758297223 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 286351800298 for ; Fri, 19 Sep 2025 15:53:43 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4AB86180035E for ; Fri, 19 Sep 2025 15:53:42 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 04/12] mptcp: introduce the mptcp_init_skb helper. Date: Fri, 19 Sep 2025 17:53:18 +0200 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: KQKx-jTvJSJwWdl99Zo4iZYRuFPpNOQbMpIFozkAKKk_1758297223 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true"

Factor out all the skb initialization steps into a new helper and use it.
Note that this change moves the MPTCP CB initialization earlier: we can
do that step as soon as the skb leaves the subflow socket receive queues.

Reviewed-by: Matthieu Baerts (NGI0)
Signed-off-by: Paolo Abeni
Reviewed-by: Geliang Tang
Tested-by: Geliang Tang
---
v1 -> v2:
 - drop subflow argument from mptcp_init_skb()
 - change msk args to sock arg in __mptcp_move_skb()
---
 net/mptcp/protocol.c | 46 ++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index c9fcdbaf50874..3aa03da781ba3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -326,27 +326,11 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
 		mptcp_rcvbuf_grow(sk);
 }
 
-static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
-			     struct sk_buff *skb, unsigned int offset,
-			     size_t copy_len)
+static void mptcp_init_skb(struct sock *ssk,
+			   struct sk_buff *skb, int offset, int copy_len)
 {
-	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-	struct sock *sk = (struct sock *)msk;
-	struct sk_buff *tail;
-	bool has_rxtstamp;
-
-	__skb_unlink(skb, &ssk->sk_receive_queue);
-
-	skb_ext_reset(skb);
-	skb_orphan(skb);
-
-	/* try to fetch required memory from subflow */
-	if (!sk_rmem_schedule(sk, skb, skb->truesize)) {
-		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED);
-		goto drop;
-	}
-
-	has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp;
+	const struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	bool has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp;
 
 	/* the skb map_seq accounts for the skb offset:
 	 * mptcp_subflow_get_mapped_dsn() is based on the current tp->copied_seq
@@ -358,6 +342,24 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
 	MPTCP_SKB_CB(skb)->has_rxtstamp = has_rxtstamp;
 	MPTCP_SKB_CB(skb)->cant_coalesce = 0;
 
+	__skb_unlink(skb, &ssk->sk_receive_queue);
+
+	skb_ext_reset(skb);
+	skb_dst_drop(skb);
+}
+
+static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb)
+{
+	u64 copy_len = MPTCP_SKB_CB(skb)->end_seq - MPTCP_SKB_CB(skb)->map_seq;
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sk_buff *tail;
+
+	/* try to fetch required memory from subflow */
+	if (!sk_rmem_schedule(sk, skb, skb->truesize)) {
+		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED);
+		goto drop;
+	}
+
 	if (MPTCP_SKB_CB(skb)->map_seq == msk->ack_seq) {
 		/* in sequence */
 		msk->bytes_received += copy_len;
@@ -678,7 +680,9 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 		if (offset < skb->len) {
 			size_t len = skb->len - offset;
 
-			ret = __mptcp_move_skb(msk, ssk, skb, offset, len) || ret;
+			mptcp_init_skb(ssk, skb, offset, len);
+			skb_orphan(skb);
+			ret = __mptcp_move_skb(sk, skb) || ret;
 			seq += len;
 
 			if (unlikely(map_remaining < len)) {
-- 
2.51.0

From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C6772505AF for ; Fri, 19 Sep 2025 15:53:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297229; cv=none; b=rJ9W2zFnwiBiTHTay69CEePyhiitzY4VdzxsAPCikWV5Upimg3bt3DDESfEGJ6w8qXMNExYShRZG0g0RL1KGoiV7bvp+WiRx2FitEEEpDqE6VTgCgBICj5vxK0n7naRa5CgJh79pJ1DDbimKLhaeWSTvTDI3rvth5LuGSoLgKSE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297229; c=relaxed/simple; bh=iAk9vAqzYYHZFIYPQEV97rMnlF28ztGVhSBzCPolCZQ=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; 
b=MxGPEKDM1kuDMlgQOqxXJYFbv9wAmlL34j0OaNmlZuN1jvAy2uwWtP777oSHaAG5x9JvRlTZr1QQ0Y9HADXg8xPgauyoalQ7OMAX/CEN1WBUQeKSyHe8XtLtjxl6tidd1SdLZeu+F6RIPxGNbos/4od1myG06CPJyZ2rKWXmDKo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=TMj+qoAG; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TMj+qoAG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297226; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SB0keMRJCj4LZh81uFy12X67Aef/jHRbynACS4wQha4=; b=TMj+qoAGDXh7Ljfy77f3jZbqALGlWR854p8/H1n+L0QZ6GWeeZsHbr9NL1ITYVryYzlYh+ cm4weK1rZ5UgwbAcAS3hV5CKQJ3ae275TiJ/NESI9MiHFGF5OsqFFWjmTEz3+A+BQMMbaL ecoFvzYnpKyrSnfgm1eoL1v2zbkH92o= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-678-0uN0OJTUMVyRN-kc4kn6Zw-1; Fri, 19 Sep 2025 11:53:45 -0400 X-MC-Unique: 0uN0OJTUMVyRN-kc4kn6Zw-1 X-Mimecast-MFC-AGG-ID: 0uN0OJTUMVyRN-kc4kn6Zw_1758297225 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 
bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id F141B1955D61 for ; Fri, 19 Sep 2025 15:53:44 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D86001800452 for ; Fri, 19 Sep 2025 15:53:43 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 05/12] mptcp: remove unneeded mptcp_move_skb() Date: Fri, 19 Sep 2025 17:53:19 +0200 Message-ID: <03ffaacc72b31e3b822cff59bbde8a9867ff34a0.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: 61EwaH7HR6a60C82bquzsCYvEit73x0-t3zY0R_bccA_1758297225 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true"

Since commit b7535cfed223 ("mptcp: drop legacy code around RX EOF"),
sk_shutdown can't change during the main recvmsg loop, so we can drop
the related race breaker.

Reviewed-by: Geliang Tang Tested-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3aa03da781ba3..909c611d5b528 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2207,14 +2207,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msg= hdr *msg, size_t len, break; } =20 - if (sk->sk_shutdown & RCV_SHUTDOWN) { - /* race breaker: the shutdown could be after the - * previous receive queue check - */ - if (__mptcp_move_skbs(sk)) - continue; + if (sk->sk_shutdown & RCV_SHUTDOWN) break; - } =20 if (sk->sk_state =3D=3D TCP_CLOSE) { copied =3D -ENOTCONN; --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5207C2C21F8 for ; Fri, 19 Sep 2025 15:53:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297230; cv=none; b=genxK9X0xvwJd8eB3dNIpT7tjJqvuum2y5pDczcXpw9k3y+UnYPs/jC974unIMbHZtCN+4AMvx34PlTTfIWy09vtIUy+6uG+uodpBgrlE9wgGFXnmUK46AV924V8KdHhjA8usBP6uN7/Vj7moq5/QcV5Wt8zZ8x6ua7iG8giqDk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297230; c=relaxed/simple; bh=tk0jaOjpxW2DkCYN0SuoutbqVV43j6pm1agowucL0UU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=qBgr1mb0q5hqCn8CJ0YsBrrkb0R2fw5ydUR2HMh+tWXKTDTxs8xMpEgJ3Olw3v3Isdq3ipY1C7LgBr1cWWew174smWaSnUFsVzMlOvchk0STPovXyLyg5TV9Lu2mjX43Rr0t6kpvx1SO8X3hRoOmofGRwFFhCPafmwy5gxR+bbQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) 
header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=JeUA6Wp3; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="JeUA6Wp3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297228; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oydInhb1rU3mKN92O0oRxak9M7HFtyLZ73iNMMrffxw=; b=JeUA6Wp3AlnUfF1uQQYzf2qG6k9x10NzYwO8XYH040Atc4c2jUDoqHhT62n6/Oe/kF5FHh HaKJKy0B0ZVFHxEy5gSxseGRTQ7yWzK90Eyi3aRp0aa+QmBt4X8hWMUuWV5na3FYp/ivoe B7ybsDQf9P2ZKWhql66F4hRva9NIl9A= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-7-OPKPq5dqNaaGICwbCNL6YQ-1; Fri, 19 Sep 2025 11:53:46 -0400 X-MC-Unique: OPKPq5dqNaaGICwbCNL6YQ-1 X-Mimecast-MFC-AGG-ID: OPKPq5dqNaaGICwbCNL6YQ_1758297226 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1DA9318002C4 for ; Fri, 19 Sep 2025 15:53:46 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by 
mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 43D6D1800452 for ; Fri, 19 Sep 2025 15:53:45 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 06/12] mptcp: factor out a basic skb coalesce helper Date: Fri, 19 Sep 2025 17:53:20 +0200 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: JTEqI37TM35g1LYcGhi2BGOQQVOpUXOArnRx0lSAX8o_1758297226 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" The upcoming patch will introduce backlog processing for MPTCP sockets, and we want to leverage coalescing in that data path. Factor out the relevant bits not touching memory accounting to deal with such use-case. Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 909c611d5b528..bd83abefe4965 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -142,22 +142,34 @@ static void mptcp_drop(struct sock *sk, struct sk_buf= f *skb) __kfree_skb(skb); } =20 -static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to, - struct sk_buff *from) +static int __mptcp_try_coalesce(struct sock *sk, struct sk_buff *to, + struct sk_buff *from, bool *fragstolen) { - bool fragstolen; + int limit =3D READ_ONCE(sk->sk_rcvbuf); int delta; =20 if (unlikely(MPTCP_SKB_CB(to)->cant_coalesce) || MPTCP_SKB_CB(from)->offset || - ((to->len + from->len) > (sk->sk_rcvbuf >> 3)) || - !skb_try_coalesce(to, from, &fragstolen, &delta)) - return false; + ((to->len + from->len) > (limit >> 3)) || + !skb_try_coalesce(to, from, fragstolen, &delta)) + return 0; =20 pr_debug("colesced seq 
%llx into %llx new len %d new end seq %llx\n", MPTCP_SKB_CB(from)->map_seq, MPTCP_SKB_CB(to)->map_seq, to->len, MPTCP_SKB_CB(from)->end_seq); MPTCP_SKB_CB(to)->end_seq =3D MPTCP_SKB_CB(from)->end_seq; + return delta; +} + +static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to, + struct sk_buff *from) +{ + bool fragstolen; + int delta; + + delta =3D __mptcp_try_coalesce(sk, to, from, &fragstolen); + if (!delta) + return false; =20 /* note the fwd memory can reach a negative value after accounting * for the delta, but the later skb free will restore a non --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8889B2505AF for ; Fri, 19 Sep 2025 15:53:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297232; cv=none; b=M7v0b2+opcT7vUhV/zPZKvRNCtTWieJyDugkjc3vKR4CQ7JxhLDgJjEUJjMf4IRewNiGV0608lOP8dyulf3nK/y9EP6OlT45VPwNayLvcFZZ22EW3hM50YKQcAs9Q+gSUDaYDGi4rvWurStsV6u3uxn6Ns/tCISbBfy0bJioAkQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297232; c=relaxed/simple; bh=OZFVg+889rXVi/oBMKCtybNgRWzxdyOV1T+3O6FiLwc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=W6k6IOQIdIRCip4JahlOoq1KZf+kcJUyHLLvnVbz6M4Fdoc76VAzE0RjKMf54BkQ/IhDoWg3yk1Gv5GTHLFXUONvAMwBzL+4o2BI9OC3exnb0rTtHwox+yiDCDUfJ4gWbgRb3R0MYFXX7RbOoAZxCueGtVymTqiSY0/yI6p9xmY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=gNiwMaef; arc=none smtp.client-ip=170.10.129.124 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="gNiwMaef" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297229; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lGEBFFJ5OwUjJ8s8mWiJ1pdZeMC1VPFCVazhkHxkjb0=; b=gNiwMaefPXLmVgjEKf8Jkbf7C/8xo5HRUSb3c7b4CL13pw9bxNWO3zYdEpV9OOkOC9H5Hm yXhT1XnZF3DBqOVp1pRWSAoipLyyz4DMjVNfaGMFnHjFXb+twYuDtJcFId9QYRFNMa8wOH R/MuHd5v+FC+ZzkrwnkXcnPaIwCJQV8= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-443-hY_iyLUgNuKoYKd5X5uFeg-1; Fri, 19 Sep 2025 11:53:48 -0400 X-MC-Unique: hY_iyLUgNuKoYKd5X5uFeg-1 X-Mimecast-MFC-AGG-ID: hY_iyLUgNuKoYKd5X5uFeg_1758297227 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 826491800298 for ; Fri, 19 Sep 2025 15:53:47 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9D9281800452 for ; Fri, 19 Sep 2025 15:53:46 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: 
[MPTCP next v3 07/12] mptcp: minor move_skbs_to_msk() cleanup Date: Fri, 19 Sep 2025 17:53:21 +0200 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: hvq9qN9BKoxdU5Qs0R94QaWxohVXOqlvVlqjr7reJfI_1758297227 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" This function is called only by __mptcp_data_ready(), which in turn is always invoked when the msk is not owned by the user: we can drop the redundant, related check. Additionally, mptcp needs to propagate the socket error only for the current subflow. Reviewed-by: Geliang Tang Tested-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index bd83abefe4965..fce70cdad2a7f 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -815,12 +815,8 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, s= truct sock *ssk) =20 moved =3D __mptcp_move_skbs_from_subflow(msk, ssk); __mptcp_ofo_queue(msk); - if (unlikely(ssk->sk_err)) { - if (!sock_owned_by_user(sk)) - __mptcp_error_report(sk); - else - __set_bit(MPTCP_ERROR_REPORT, &msk->cb_flags); - } + if (unlikely(ssk->sk_err)) + __mptcp_subflow_error_report(sk, ssk); =20 /* If the moves have caught up with the DATA_FIN sequence number * it's time to ack the DATA_FIN and change socket state, but --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1039B31CA72 for ; Fri, 19 
Sep 2025 15:53:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297233; cv=none; b=t8NOKEu1Hw+hvqc9ARCItBcMgyeLBkcxceX2AXRFSdvGiW26sUfLFV+WXoCIJh91YuaS4BK+IW5dqlq3aiFLtBVjEtsf/uUrv6+BtYNzb4uN0g2V83Kwo+ybJ9Qa8cR6hNMhMKCbIohHi9fqRSRERyDDQD5kchFzJkHSOlcFTiw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297233; c=relaxed/simple; bh=XuJG5/sJHfin2N+OpNjYOCGv2SEQQwQiIY760zsHBFc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=rT5120TjWYf37+56witaYv3KDUF9INMRbcPKaDQdoloqRL/SLjlj3c2fRc6XtSjJudhCTt/qtCjycCZelBGukwIqA8FmafZxQkm9ShAHlsN0eTyFqDVWDaNbkDQu1CebGfnRDS/87S0l0WcgTAEzuA3HdRYLFn0OdcGTyCRr5VI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=YtyoXMvH; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="YtyoXMvH" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297231; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sJYzfpmni0bBnSsOT+Wc4kDdBSkWFUQzBLXgN+BULkw=; b=YtyoXMvHOxKlbrs8Xm22khYt0yfIKbIDrV+P96kltY6xMvj3IKbzbhL71K4xCNWjj/5pbE bDE59s7B67J4lVSwiqbwcrC9O+GIvxh91zTj1njsJ6he0FdcUhG18WQxGFij2qnMhrKaVW LqXQLt+4qfbKIy0EqcTfKM2p9V8n2b8= Received: from 
mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-670-0v_ITWhaN2-NIjWD-0xm5Q-1; Fri, 19 Sep 2025 11:53:49 -0400 X-MC-Unique: 0v_ITWhaN2-NIjWD-0xm5Q-1 X-Mimecast-MFC-AGG-ID: 0v_ITWhaN2-NIjWD-0xm5Q_1758297229 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E11401800451 for ; Fri, 19 Sep 2025 15:53:48 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 0EF0F180035E for ; Fri, 19 Sep 2025 15:53:47 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 08/12] mptcp: cleanup fallback data fin reception Date: Fri, 19 Sep 2025 17:53:22 +0200 Message-ID: <48b486e75249911cbee4325706622116e3a41a66.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: fc9j5uLDdJ_9A3gMeD_9BiEKhMMsDJwmltyEUkH5Xp8_1758297229 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" MPTCP currently generates a dummy data_fin for fallback sockets when the fallback subflow has completed data reception, using the current ack_seq. 
We are going to introduce backlog usage for the msk soon, even for fallback sockets: the ack_seq value will not match the most recent sequence number seen by the fallback subflow socket, as it will ignore data_seq sitting in the backlog. Instead use the last map sequence number to set the data_fin, as fallback (dummy) map sequences are always in sequence. Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Tested-by: Geliang Tang --- v2 -> v3: - keep the close check in subflow_sched_work_if_closed, fix CI failures --- net/mptcp/subflow.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index e8325890a3223..b9455c04e8a46 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1285,6 +1285,7 @@ static bool subflow_is_done(const struct sock *sk) /* sched mptcp worker for subflow cleanup if no more data is pending */ static void subflow_sched_work_if_closed(struct mptcp_sock *msk, struct so= ck *ssk) { + const struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); struct sock *sk =3D (struct sock *)msk; =20 if (likely(ssk->sk_state !=3D TCP_CLOSE && @@ -1303,7 +1304,8 @@ static void subflow_sched_work_if_closed(struct mptcp= _sock *msk, struct sock *ss */ if (__mptcp_check_fallback(msk) && subflow_is_done(ssk) && msk->first =3D=3D ssk && - mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true)) + mptcp_update_rcv_data_fin(msk, subflow->map_seq + + subflow->map_data_len, true)) mptcp_schedule_work(sk); } =20 --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E8C7238171 for ; Fri, 19 Sep 2025 15:53:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1758297234; cv=none; b=YimiNm182OEfF/azzBIgmpHGX5gDIP2bu/bKQB9WlX65/v2VeU7co08I04cSo6YOYZCoU1HDUfKhGCGEVQ6ghS5hqh+Jo32dcHjpOFkfsOD/NKtC24QsbSM4FoVJjF6t0cUet6lgBfDi7tubmHB9qRgOyZzlB/d8kiGMhWtbRAk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297234; c=relaxed/simple; bh=cHC7SyoQaWzjR2vS6Foj3Ii6NhzJkF7W5iCJMDffH7U=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=PbI7AEHMFFBI6HDB7OnwkFM02AonvlqY0euRBpr7/JTJLEflQSwxriKhEadioeqLmAknXeNGeQilnm3TLc1D4WcyMiZgGX2lDH/Z9ib5O0wIDXQ1edtcHE5+u6W7eU7HqW4F99dvOHuLbCVNBeX7gHH2/UGfTV4b79or7i2faK8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=XQC1hedw; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XQC1hedw" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297232; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O03gDUmDLsG09MzHoxkVK8DWVnI4Flggz/31DjtdxW0=; b=XQC1hedwv9CIRHhsLpTvWCnXHCnrmL7t0dWxxn5IP+6IL34MHeqjbKoHhbJh9JsU1xmuG5 M5dHRciE9sKngOCqI47QfO1bG1InTSSEsBMcR5XksxWN3uJ9fqs5wzgRIMieBnZR8KEe/+ cYSvXycfyCI2ojL5SjIDiklD1Hgxlug= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with 
STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-155-_oWfJmtdMFuJzyckCQbTyQ-1; Fri, 19 Sep 2025 11:53:51 -0400 X-MC-Unique: _oWfJmtdMFuJzyckCQbTyQ-1 X-Mimecast-MFC-AGG-ID: _oWfJmtdMFuJzyckCQbTyQ_1758297230 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 46A8A1955E7E for ; Fri, 19 Sep 2025 15:53:50 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6BC2E1800452 for ; Fri, 19 Sep 2025 15:53:49 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 09/12] mptcp: cleanup fallback dummy mapping generation. Date: Fri, 19 Sep 2025 17:53:23 +0200 Message-ID: <31bedb1bf0f3448fa804d3ef8d41fada09636a26.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: ZRQFE7lqcYNUm-S4KYKqS9Tyxj25OWtfyWNUyluu7hs_1758297230 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" MPTCP currently accesses ack_seq outside the msk socket lock scope to generate the dummy mapping for fallback sockets. Soon we are going to introduce backlog usage, and even for fallback sockets the ack_seq value will be significantly off outside of the msk socket lock scope. Avoid relying on ack_seq for dummy mapping generation, using instead the subflow sequence number. 
Note that in case of disconnect() and (re)connect() we must ensure that any previous state is reset. Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Tested-by: Geliang Tang --- v2 -> v3: - reordered before the backlog introduction to avoid transiently breaking the fallback - explicitly reset ack_seq --- net/mptcp/protocol.c | 3 +++ net/mptcp/subflow.c | 8 +++++++- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index fce70cdad2a7f..c8b02048126a9 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3224,6 +3224,9 @@ static int mptcp_disconnect(struct sock *sk, int flag= s) msk->bytes_retrans =3D 0; msk->rcvspace_init =3D 0; =20 + /* for fallback's sake */ + WRITE_ONCE(msk->ack_seq, 0); + WRITE_ONCE(sk->sk_shutdown, 0); sk_error_report(sk); return 0; diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index b9455c04e8a46..ac8616e7521e8 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -491,6 +491,9 @@ static void subflow_set_remote_key(struct mptcp_sock *m= sk, mptcp_crypto_key_sha(subflow->remote_key, NULL, &subflow->iasn); subflow->iasn++; =20 + /* for fallback's sake */ + subflow->map_seq =3D subflow->iasn; + WRITE_ONCE(msk->remote_key, subflow->remote_key); WRITE_ONCE(msk->ack_seq, subflow->iasn); WRITE_ONCE(msk->can_ack, true); @@ -1435,9 +1438,12 @@ static bool subflow_check_data_avail(struct sock *ss= k) =20 skb =3D skb_peek(&ssk->sk_receive_queue); subflow->map_valid =3D 1; - subflow->map_seq =3D READ_ONCE(msk->ack_seq); subflow->map_data_len =3D skb->len; subflow->map_subflow_seq =3D tcp_sk(ssk)->copied_seq - subflow->ssn_offse= t; + subflow->map_seq =3D __mptcp_expand_seq(subflow->map_seq, + subflow->iasn + + TCP_SKB_CB(skb)->seq - + subflow->ssn_offset - 1); WRITE_ONCE(subflow->data_avail, true); return true; } --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CE34C238171 for ; Fri, 19 Sep 2025 15:53:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297237; cv=none; b=OttbSOUSJc6e7kZ8UibzmN+U19N4tNQvlRvIKTgo6TL+3BP6XZEjo0VexBiUbjnSYLcbUkEWRWO0/14ta9rJ3ngXZbsQtY6HQFAnxM8FhE/ExaSWTlbZTCViWpKkjXrL/xg+iYneqfvXv8Kntr+0wmX/q67MQsoe44nIgn5svAc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297237; c=relaxed/simple; bh=K4p8gtktm/tuB0liGUqVkJlJ8oAnLO3X76hg7jyRizw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=XxFW89rV2OLESpPIEA+vh9t5YA2Hw8pVkE8q0/rGmZa+A3nBhSOAfMYHc9PHJbybb55Fp7zZxZ+1fXiVmJ5j6UWcHT0OJiZuPUrDFlGoKRG3F3NXFUCfBVPSdE5vEJDUBJGibM8Om6gDift7rydPinDHSXuFsr+WZMcyPOeUPt0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=SlyUGR+P; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="SlyUGR+P" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297234; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=roCOWYSlmbnc6C2OWUhmornSiXR8e+hTye8Q6K9kfE0=; 
b=SlyUGR+PRXYUhuLK2TR5QE/G/74V7s//VURbTQQ1SCugSG/Vqj+mXoUE5BPmk8D/IS5fGR stRxtjgC0z4EG+FvW64ZP5O70ubXQq95F8a0zwRL4JSoOM5iyEZG+6PswE+hHn92UPiQ5q oNS7ZMqzR+tDkWiR34X/UjM1TkTYbSA= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-600-qDBjnDpcNPixJmCm4S_Hgg-1; Fri, 19 Sep 2025 11:53:52 -0400 X-MC-Unique: qDBjnDpcNPixJmCm4S_Hgg-1 X-Mimecast-MFC-AGG-ID: qDBjnDpcNPixJmCm4S_Hgg_1758297231 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A2CC9180034F for ; Fri, 19 Sep 2025 15:53:51 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C3C331800452 for ; Fri, 19 Sep 2025 15:53:50 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 10/12] mptcp: leverage the sk backlog for RX packet processing. 
Date: Fri, 19 Sep 2025 17:53:24 +0200 Message-ID: <5d137f1505b6d41092fcbc50e1ad65c8f0cc9440.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: rYFqJG-lPlxdqEmygiComLoC-HeelZ_EYTSn5b4wqg8_1758297231 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" This streamlines the RX path implementation and improves RX performance by reducing the subflow-level locking and the amount of work done under the msk socket lock; the implementation closely mirrors the TCP backlog processing. Note that MPTCP now needs to traverse the existing subflows looking for data that was left there due to the msk receive buffer being full, and does so only after recvmsg completely empties the receive queue. Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Tested-by: Geliang Tang --- net/mptcp/protocol.c | 103 ++++++++++++++++++++++++++++++------------- net/mptcp/protocol.h | 2 +- 2 files changed, 73 insertions(+), 32 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index c8b02048126a9..201e6ac5fe631 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -360,6 +360,27 @@ static void mptcp_init_skb(struct sock *ssk, skb_dst_drop(skb); } =20 +static void __mptcp_add_backlog(struct sock *sk, struct sock *ssk, + struct sk_buff *skb) +{ + struct sk_buff *tail =3D sk->sk_backlog.tail; + bool fragstolen; + int delta; + + if (tail && MPTCP_SKB_CB(skb)->map_seq =3D=3D MPTCP_SKB_CB(tail)->end_seq= ) { + delta =3D __mptcp_try_coalesce(sk, tail, skb, &fragstolen); + if (delta) { + sk->sk_backlog.len +=3D delta; + kfree_skb_partial(skb, fragstolen); + return; + } + } + + /* mptcp checks the limit before adding the skb to the backlog */ + __sk_add_backlog(sk, skb); + sk->sk_backlog.len 
+=3D skb->truesize; +} + static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb) { u64 copy_len =3D MPTCP_SKB_CB(skb)->end_seq - MPTCP_SKB_CB(skb)->map_seq; @@ -648,7 +669,7 @@ static void mptcp_dss_corruption(struct mptcp_sock *msk= , struct sock *ssk) } =20 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, - struct sock *ssk) + struct sock *ssk, bool own_msk) { struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); struct sock *sk =3D (struct sock *)msk; @@ -659,12 +680,13 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt= cp_sock *msk, pr_debug("msk=3D%p ssk=3D%p\n", msk, ssk); tp =3D tcp_sk(ssk); do { + int mem =3D own_msk ? sk_rmem_alloc_get(sk) : sk->sk_backlog.len; u32 map_remaining, offset; u32 seq =3D tp->copied_seq; struct sk_buff *skb; bool fin; =20 - if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf) + if (mem > READ_ONCE(sk->sk_rcvbuf)) break; =20 /* try to move as much data as available */ @@ -694,7 +716,11 @@ static bool __mptcp_move_skbs_from_subflow(struct mptc= p_sock *msk, =20 mptcp_init_skb(ssk, skb, offset, len); skb_orphan(skb); - ret =3D __mptcp_move_skb(sk, skb) || ret; + + if (own_msk) + ret |=3D __mptcp_move_skb(sk, skb); + else + __mptcp_add_backlog(sk, ssk, skb); seq +=3D len; =20 if (unlikely(map_remaining < len)) { @@ -715,7 +741,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp= _sock *msk, =20 } while (more_data_avail); =20 - if (ret) + if (ret && own_msk) msk->last_data_recv =3D tcp_jiffies32; return ret; } @@ -813,7 +839,7 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, st= ruct sock *ssk) struct sock *sk =3D (struct sock *)msk; bool moved; =20 - moved =3D __mptcp_move_skbs_from_subflow(msk, ssk); + moved =3D __mptcp_move_skbs_from_subflow(msk, ssk, true); __mptcp_ofo_queue(msk); if (unlikely(ssk->sk_err)) __mptcp_subflow_error_report(sk, ssk); @@ -828,18 +854,10 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, = struct sock *ssk) return moved; } =20 -static void 
__mptcp_data_ready(struct sock *sk, struct sock *ssk) -{ - struct mptcp_sock *msk =3D mptcp_sk(sk); - - /* Wake-up the reader only for in-sequence data */ - if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk)) - sk->sk_data_ready(sk); -} - void mptcp_data_ready(struct sock *sk, struct sock *ssk) { struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); + struct mptcp_sock *msk =3D mptcp_sk(sk); =20 /* The peer can send data while we are shutting down this * subflow at msk destruction time, but we must avoid enqueuing @@ -849,13 +867,33 @@ void mptcp_data_ready(struct sock *sk, struct sock *s= sk) return; =20 mptcp_data_lock(sk); - if (!sock_owned_by_user(sk)) - __mptcp_data_ready(sk, ssk); - else - __set_bit(MPTCP_DEQUEUE, &mptcp_sk(sk)->cb_flags); + if (!sock_owned_by_user(sk)) { + /* Wake-up the reader only for in-sequence data */ + if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk)) + sk->sk_data_ready(sk); + } else { + __mptcp_move_skbs_from_subflow(msk, ssk, false); + if (unlikely(ssk->sk_err)) + __set_bit(MPTCP_ERROR_REPORT, &msk->cb_flags); + } mptcp_data_unlock(sk); } =20 +static int mptcp_move_skb(struct sock *sk, struct sk_buff *skb) +{ + struct mptcp_sock *msk =3D mptcp_sk(sk); + + if (__mptcp_move_skb(sk, skb)) { + msk->last_data_recv =3D tcp_jiffies32; + __mptcp_ofo_queue(msk); + /* notify ack seq update */ + mptcp_cleanup_rbuf(msk, 0); + mptcp_check_data_fin(sk); + sk->sk_data_ready(sk); + } + return 0; +} + static void mptcp_subflow_joined(struct mptcp_sock *msk, struct sock *ssk) { mptcp_subflow_ctx(ssk)->map_seq =3D READ_ONCE(msk->ack_seq); @@ -2117,7 +2155,7 @@ static bool __mptcp_move_skbs(struct sock *sk) =20 ssk =3D mptcp_subflow_tcp_sock(subflow); slowpath =3D lock_sock_fast(ssk); - ret =3D __mptcp_move_skbs_from_subflow(msk, ssk) || ret; + ret =3D __mptcp_move_skbs_from_subflow(msk, ssk, true) || ret; if (unlikely(ssk->sk_err)) __mptcp_error_report(sk); unlock_sock_fast(ssk, slowpath); @@ -2193,8 +2231,12 @@ static int 
mptcp_recvmsg(struct sock *sk, struct msg= hdr *msg, size_t len, =20 copied +=3D bytes_read; =20 - if (skb_queue_empty(&sk->sk_receive_queue) && __mptcp_move_skbs(sk)) - continue; + if (skb_queue_empty(&sk->sk_receive_queue)) { + __sk_flush_backlog(sk); + if (!skb_queue_empty(&sk->sk_receive_queue) || + __mptcp_move_skbs(sk)) + continue; + } =20 /* only the MPTCP socket status is relevant here. The exit * conditions mirror closely tcp_recvmsg() @@ -2542,7 +2584,6 @@ static void __mptcp_close_subflow(struct sock *sk) =20 mptcp_close_ssk(sk, ssk, subflow); } - } =20 static bool mptcp_close_tout_expired(const struct sock *sk) @@ -3126,6 +3167,13 @@ bool __mptcp_close(struct sock *sk, long timeout) pr_debug("msk=3D%p state=3D%d\n", sk, sk->sk_state); mptcp_pm_connection_closed(msk); =20 + /* process the backlog; note that it never destroys the msk */ + local_bh_disable(); + bh_lock_sock(sk); + __release_sock(sk); + bh_unlock_sock(sk); + local_bh_enable(); + if (sk->sk_state =3D=3D TCP_CLOSE) { __mptcp_destroy_sock(sk); do_cancel_work =3D true; @@ -3429,8 +3477,7 @@ void __mptcp_check_push(struct sock *sk, struct sock = *ssk) =20 #define MPTCP_FLAGS_PROCESS_CTX_NEED (BIT(MPTCP_PUSH_PENDING) | \ BIT(MPTCP_RETRANSMIT) | \ - BIT(MPTCP_FLUSH_JOIN_LIST) | \ - BIT(MPTCP_DEQUEUE)) + BIT(MPTCP_FLUSH_JOIN_LIST)) =20 /* processes deferred events and flush wmem */ static void mptcp_release_cb(struct sock *sk) @@ -3464,11 +3511,6 @@ static void mptcp_release_cb(struct sock *sk) __mptcp_push_pending(sk, 0); if (flags & BIT(MPTCP_RETRANSMIT)) __mptcp_retrans(sk); - if ((flags & BIT(MPTCP_DEQUEUE)) && __mptcp_move_skbs(sk)) { - /* notify ack seq update */ - mptcp_cleanup_rbuf(msk, 0); - sk->sk_data_ready(sk); - } =20 cond_resched(); spin_lock_bh(&sk->sk_lock.slock); @@ -3704,8 +3746,6 @@ static int mptcp_ioctl(struct sock *sk, int cmd, int = *karg) return -EINVAL; =20 lock_sock(sk); - if (__mptcp_move_skbs(sk)) - mptcp_cleanup_rbuf(msk, 0); *karg =3D mptcp_inq_hint(sk);
release_sock(sk); break; @@ -3817,6 +3857,7 @@ static struct proto mptcp_prot =3D { .sendmsg =3D mptcp_sendmsg, .ioctl =3D mptcp_ioctl, .recvmsg =3D mptcp_recvmsg, + .backlog_rcv =3D mptcp_move_skb, .release_cb =3D mptcp_release_cb, .hash =3D mptcp_hash, .unhash =3D mptcp_unhash, diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 6ac58e92a1aa3..7bfd4e0d21a8a 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -124,7 +124,6 @@ #define MPTCP_FLUSH_JOIN_LIST 5 #define MPTCP_SYNC_STATE 6 #define MPTCP_SYNC_SNDBUF 7 -#define MPTCP_DEQUEUE 8 =20 struct mptcp_skb_cb { u64 map_seq; @@ -408,6 +407,7 @@ static inline int mptcp_space_from_win(const struct soc= k *sk, int win) static inline int __mptcp_space(const struct sock *sk) { return mptcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf) - + READ_ONCE(sk->sk_backlog.len) - sk_rmem_alloc_get(sk)); } =20 --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0D3562505AF for ; Fri, 19 Sep 2025 15:53:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297237; cv=none; b=rIegUxCOHmWsuuBPL1nduFjK2S5EtaCysXWe0lPeOEF5zrbZy8BXEa0CNjk+KG5rHCuXfaucWdjV5UW6kJEM0Hd3eBA5bpsG7cgNGTMIhUI4SXBA+Bakh6hk2PR78FQLql/a8Neq11/3xAm3OHS9xjfh8AZPMjrlEKyAR0AnjtM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297237; c=relaxed/simple; bh=m0GT7gE7Q6Rb6irZ5YomX4U0/rGNe8slelTVThvrDrM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; 
b=WcjfbN031P7CQsDSMM7ZlvR52e6m/gnYK/G7ut9YJ7Dkt7E+7VMCr33WWLz/fpvGjMCDz9lsvGZiIB0XOi0FxiZWmHHNXjCoTJE6gWMc43HczHXMEKroAFutsM+fAJIpnfp6RCqU8sRGpRnh7VtfmLhDv8G4GBzMqJBS+TtkqoE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=czY9c4wa; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="czY9c4wa" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297235; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tAgSptIs6u3HNzfo35xURYw/mO0S9ufu/jIMp4vzgUs=; b=czY9c4waBTE3n9mZDmlaNSHY8+aExssLU7Xhpv8MyzNYXTIzsSMgWRSEpeqYTs7fE+o0gi LIDP9xEBGb+Vc6yjUI16Jp/w3AxZdS5jrEDmc5PYQ/8mYgTPuCx6L5wZio1ucONIEMio9g xUKn+ssJn5Y5ModYShNnkL1O1fGflW4= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-321-Wm8dNdnJPaeVSZ2t1hi-4g-1; Fri, 19 Sep 2025 11:53:53 -0400 X-MC-Unique: Wm8dNdnJPaeVSZ2t1hi-4g-1 X-Mimecast-MFC-AGG-ID: Wm8dNdnJPaeVSZ2t1hi-4g_1758297233 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 
bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 0854419560AE for ; Fri, 19 Sep 2025 15:53:53 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2C35918004D8 for ; Fri, 19 Sep 2025 15:53:51 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 11/12] mptcp: prevent __mptcp_move_skbs() from interfering with the fastpath Date: Fri, 19 Sep 2025 17:53:25 +0200 Message-ID: <3ee365d367054da93f2532a1727dde3224aa147f.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: pWb94lrhW3tiksgFQkc_wwT3HrKGXU3yOjC8Qlds-E4_1758297233 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" skbs are left waiting in a subflow only in exceptional cases. We want to avoid interfering with the fast path by having __mptcp_move_skbs() unintentionally process packets that landed in the subflows after the last check. Use a separate per-subflow flag to mark delayed skbs, and only process subflows that have the flag set. Also add new MIB counters to track these exceptional events.
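The gating described above can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct subflow_model`, `mark_delayed()` and `drain_delayed()` are hypothetical names that do not exist in the kernel, and locking, MIB accounting and the real skb movement are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mptcp_subflow_context (illustration only). */
struct subflow_model {
	bool data_delayed;	/* the move loop stopped because the msk was over limit */
	int pending;		/* bytes still queued in the subflow */
};

/* Models the check at the top of the move loop: record whether the msk
 * memory is above the receive buffer, and report whether to stop. */
static bool mark_delayed(struct subflow_model *sf, int mem, int rcvbuf)
{
	sf->data_delayed = mem > rcvbuf;
	return sf->data_delayed;
}

/* Models the slow path: subflows that were never flagged are skipped, so
 * data that arrived after the last check stays on the fast path.
 * Clearing the flag here is a simplification of this model. */
static int drain_delayed(struct subflow_model *sfs, size_t n)
{
	int moved = 0;

	for (size_t i = 0; i < n; i++) {
		if (!sfs[i].data_delayed)
			continue;
		moved += sfs[i].pending;
		sfs[i].pending = 0;
		sfs[i].data_delayed = false;
	}
	return moved;
}
```

In this model only the flagged subflow is drained; an unflagged subflow keeps its pending bytes untouched.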
Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Tested-by: Geliang Tang --- v1 -> v2: - rebased --- net/mptcp/mib.c | 2 ++ net/mptcp/mib.h | 4 ++++ net/mptcp/protocol.c | 40 ++++++++++++---------------------------- net/mptcp/protocol.h | 1 + 4 files changed, 19 insertions(+), 28 deletions(-) diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c index 6003e47c770a7..ac5ccf81159de 100644 --- a/net/mptcp/mib.c +++ b/net/mptcp/mib.c @@ -85,6 +85,8 @@ static const struct snmp_mib mptcp_snmp_list[] =3D { SNMP_MIB_ITEM("DssFallback", MPTCP_MIB_DSSFALLBACK), SNMP_MIB_ITEM("SimultConnectFallback", MPTCP_MIB_SIMULTCONNFALLBACK), SNMP_MIB_ITEM("FallbackFailed", MPTCP_MIB_FALLBACKFAILED), + SNMP_MIB_ITEM("RcvDelayed", MPTCP_MIB_RCVDELAYED), + SNMP_MIB_ITEM("DelayedProcess", MPTCP_MIB_DELAYED_PROCESS), }; =20 /* mptcp_mib_alloc - allocate percpu mib counters diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h index 309bac6fea325..f6d0eaea463e5 100644 --- a/net/mptcp/mib.h +++ b/net/mptcp/mib.h @@ -88,6 +88,10 @@ enum linux_mptcp_mib_field { MPTCP_MIB_DSSFALLBACK, /* Bad or missing DSS */ MPTCP_MIB_SIMULTCONNFALLBACK, /* Simultaneous connect */ MPTCP_MIB_FALLBACKFAILED, /* Can't fallback due to msk status */ + MPTCP_MIB_RCVDELAYED, /* Data move from subflow is delayed due to msk + * receive buffer full + */ + MPTCP_MIB_DELAYED_PROCESS, /* Delayed data moved in slowpath */ __MPTCP_MIB_MAX }; =20 diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 201e6ac5fe631..2a025c0c4ca0c 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -681,13 +681,17 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt= cp_sock *msk, tp =3D tcp_sk(ssk); do { int mem =3D own_msk ? 
sk_rmem_alloc_get(sk) : sk->sk_backlog.len; + bool over_limit =3D mem > READ_ONCE(sk->sk_rcvbuf); u32 map_remaining, offset; u32 seq =3D tp->copied_seq; struct sk_buff *skb; bool fin; =20 - if (mem > READ_ONCE(sk->sk_rcvbuf)) + WRITE_ONCE(subflow->data_delayed, over_limit); + if (subflow->data_delayed) { + MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVDELAYED); break; + } =20 /* try to move as much data as available */ map_remaining =3D subflow->map_data_len - @@ -2113,32 +2117,13 @@ static void mptcp_rcv_space_adjust(struct mptcp_soc= k *msk, int copied) msk->rcvq_space.time =3D mstamp; } =20 -static struct mptcp_subflow_context * -__mptcp_first_ready_from(struct mptcp_sock *msk, - struct mptcp_subflow_context *subflow) -{ - struct mptcp_subflow_context *start_subflow =3D subflow; - - while (!READ_ONCE(subflow->data_avail)) { - subflow =3D mptcp_next_subflow(msk, subflow); - if (subflow =3D=3D start_subflow) - return NULL; - } - return subflow; -} - static bool __mptcp_move_skbs(struct sock *sk) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk =3D mptcp_sk(sk); bool ret =3D false; =20 - if (list_empty(&msk->conn_list)) - return false; - - subflow =3D list_first_entry(&msk->conn_list, - struct mptcp_subflow_context, node); - for (;;) { + mptcp_for_each_subflow(msk, subflow) { struct sock *ssk; bool slowpath; =20 @@ -2149,23 +2134,22 @@ static bool __mptcp_move_skbs(struct sock *sk) if (sk_rmem_alloc_get(sk) > sk->sk_rcvbuf) break; =20 - subflow =3D __mptcp_first_ready_from(msk, subflow); - if (!subflow) - break; + if (!subflow->data_delayed) + continue; =20 ssk =3D mptcp_subflow_tcp_sock(subflow); slowpath =3D lock_sock_fast(ssk); - ret =3D __mptcp_move_skbs_from_subflow(msk, ssk, true) || ret; + ret |=3D __mptcp_move_skbs_from_subflow(msk, ssk, true); if (unlikely(ssk->sk_err)) __mptcp_error_report(sk); unlock_sock_fast(ssk, slowpath); - - subflow =3D mptcp_next_subflow(msk, subflow); } =20 __mptcp_ofo_queue(msk); - if (ret) + if (ret) { + 
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DELAYED_PROCESS); mptcp_check_data_fin((struct sock *)msk); + } return ret; } =20 diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 7bfd4e0d21a8a..a295ce11774ea 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -560,6 +560,7 @@ struct mptcp_subflow_context { u8 reset_transient:1; u8 reset_reason:4; u8 stale_count; + bool data_delayed; =20 u32 subflow_id; =20 --=20 2.51.0 From nobody Sat Oct 11 05:54:48 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B76D831CA57 for ; Fri, 19 Sep 2025 15:53:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297239; cv=none; b=NxuYqaCgr1ZGlM8fmIaAH1O5ODQ1V8dhwxqSkfAIHHbbz4Vi145OcGif1Vr4+tfZWxzjTyB69C+c5hFHE8yZ4eHTtDiB06p9AldSl2acejP1FSaJGJhakMPxkG+7If15gOO8CigWbPf59jWrx12xSbPlA1FB6TKpgNEMmGDNyZ0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758297239; c=relaxed/simple; bh=I/pvKuIoRLKrd5FO2JvcUHBy8aiteUrpeS3WW5nnuUk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=IRfUTFObH+LS0xCLyoMZJxA9Pkp3AqASjLiCD0SMTuX5hee5In8byopj7UmnI7o3vtlQuKeYQ9ORywV8GTR/aIvqx6cF+2KX2dEAK4Qalj17nGXCJzaKzwl/S8Pc13t9soZ/r4GWSeRL8YJhcm3RpRbzPUs0bZtHW0M/N/S8QtE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=bu91/Tuk; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="bu91/Tuk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1758297236; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fjb+cIf+fmWWN/Wpl86mlEWkemfac6TNnJd/yK2ZTHc=; b=bu91/TukE8G9Vc4cwOHxLaJ0tLlfW6mS1jBVxzBWTWzXyiKTLNdqkJk+uR7A4HN+4aBnKO XXRb4P6HbU7+VXS/uGbfim5W2AaF7GpGGppXqGXv25jZNAN7SNfEV9vkfJOnSO9WUvwcki njN9ee0sReA7GRz5tjw8Fd+YGEkKpP8= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-473-nC7uaJAmNYmSErJ2wZ7wpw-1; Fri, 19 Sep 2025 11:53:55 -0400 X-MC-Unique: nC7uaJAmNYmSErJ2wZ7wpw-1 X-Mimecast-MFC-AGG-ID: nC7uaJAmNYmSErJ2wZ7wpw_1758297234 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 626661800350 for ; Fri, 19 Sep 2025 15:53:54 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.255]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 862F01800452 for ; Fri, 19 Sep 2025 15:53:53 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [MPTCP next v3 12/12] mptcp: borrow forward memory from subflow Date: Fri, 19 Sep 2025 17:53:26 +0200 Message-ID: 
<34ad291e860dfa1bc1c7d5fe3cc5a07820bec571.1758296923.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: vI0S_Wpw0ggFsec7J_kQTlvnGEVMirT1ytSauxng8MM_1758297234 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" In the MPTCP receive path, we release the fwd memory allocated by the subflow only to allocate it again shortly afterwards for the msk. That could increase the chance of failure, especially during backlog processing, when other actions could consume the just-released memory before the msk socket has a chance to do the rcv allocation. Replace the skb_orphan() call with an open-coded variant that explicitly borrows, with PAGE_SIZE granularity, the fwd memory from the subflow socket instead of releasing it. During backlog processing the borrowed memory is accounted at release_cb time.
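As a rough illustration of the borrowing arithmetic, the following user-space sketch mirrors the computation in the hunk below. It is a model, not kernel code: `struct sock_model`, `borrow_fwd_mem()` and `MODEL_PAGE_SIZE` are hypothetical names, the 4096-byte page size is an assumption, and `unused_reserved` stands in for the value returned by sk_unused_reserved_mem().

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size, for the model only */

/* Hypothetical stand-in for the subflow socket accounting fields. */
struct sock_model {
	int rmem_alloc;		/* models sk_rmem_alloc */
	int forward_alloc;	/* models sk_forward_alloc */
	int unused_reserved;	/* models sk_unused_reserved_mem() */
};

/* Uncharge one skb from the subflow and return how much forward-allocated
 * memory can be handed to the msk, rounded down to a whole page. */
static int borrow_fwd_mem(struct sock_model *ssk, int truesize)
{
	int borrowed;

	ssk->rmem_alloc -= truesize;
	borrowed = ssk->forward_alloc - ssk->unused_reserved;
	borrowed &= ~(MODEL_PAGE_SIZE - 1);
	/* the skb memory returns to the subflow, minus what was lent out */
	ssk->forward_alloc += truesize - borrowed;
	return borrowed;
}
```

For example, with `forward_alloc` at 5000, no reserved memory and an skb of truesize 2304, the model borrows one page (4096) and leaves the subflow with 3208 bytes of forward allocation; the caller would then credit the borrowed amount to the msk.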
Signed-off-by: Paolo Abeni Reviewed-by: Geliang Tang Tested-by: Geliang Tang --- v1 -> v2: - rebased - explain why skb_orphan is removed --- net/mptcp/protocol.c | 27 +++++++++++++++++++++------ net/mptcp/protocol.h | 1 + 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 2a025c0c4ca0c..7db5adb43d41b 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -338,11 +338,12 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *m= sk, struct sk_buff *skb) mptcp_rcvbuf_grow(sk); } =20 -static void mptcp_init_skb(struct sock *ssk, - struct sk_buff *skb, int offset, int copy_len) +static int mptcp_init_skb(struct sock *ssk, + struct sk_buff *skb, int offset, int copy_len) { const struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); bool has_rxtstamp =3D TCP_SKB_CB(skb)->has_rxtstamp; + int borrowed; =20 /* the skb map_seq accounts for the skb offset: * mptcp_subflow_get_mapped_dsn() is based on the current tp->copied_seq @@ -358,6 +359,15 @@ static void mptcp_init_skb(struct sock *ssk, =20 skb_ext_reset(skb); skb_dst_drop(skb); + + /* "borrow" the fwd memory from the subflow, instead of reclaiming it */ + skb->destructor =3D NULL; + skb->sk =3D NULL; + atomic_sub(skb->truesize, &ssk->sk_rmem_alloc); + borrowed =3D ssk->sk_forward_alloc - sk_unused_reserved_mem(ssk); + borrowed &=3D ~(PAGE_SIZE - 1); + sk_forward_alloc_add(ssk, skb->truesize - borrowed); + return borrowed; } =20 static void __mptcp_add_backlog(struct sock *sk, struct sock *ssk, @@ -717,14 +727,17 @@ static bool __mptcp_move_skbs_from_subflow(struct mpt= cp_sock *msk, =20 if (offset < skb->len) { size_t len =3D skb->len - offset; + int bmem; =20 - mptcp_init_skb(ssk, skb, offset, len); - skb_orphan(skb); + bmem =3D mptcp_init_skb(ssk, skb, offset, len); =20 - if (own_msk) + if (own_msk) { + sk_forward_alloc_add(sk, bmem); ret |=3D __mptcp_move_skb(sk, skb); - else + } else { + msk->borrowed_fwd_mem +=3D bmem; 
__mptcp_add_backlog(sk, ssk, skb); + } seq +=3D len; =20 if (unlikely(map_remaining < len)) { @@ -3514,6 +3527,8 @@ static void mptcp_release_cb(struct sock *sk) if (__test_and_clear_bit(MPTCP_SYNC_SNDBUF, &msk->cb_flags)) __mptcp_sync_sndbuf(sk); } + sk_forward_alloc_add(sk, msk->borrowed_fwd_mem); + msk->borrowed_fwd_mem =3D 0; } =20 /* MP_JOIN client subflow must wait for 4th ack before sending any data: diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index a295ce11774ea..ff87dd9a0da5a 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -298,6 +298,7 @@ struct mptcp_sock { u32 last_data_sent; u32 last_data_recv; u32 last_ack_recv; + int borrowed_fwd_mem; unsigned long timer_ival; u32 token; unsigned long flags; --=20 2.51.0