From nobody Sat May 30 15:33:06 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B433B3C454F for ; Fri, 29 May 2026 10:44:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051491; cv=none; b=dP8yRmBaJVwC0BZyX7YxEundROH5Nyu0zbLOQyI8Py56OrDLoP0Bk6mY5aOCdlWw4zD7pb8ymfZJEhUKqZGLgaEv53/7XmjDbYTqBXmqVkNL0mSHh2R9FQPjfHH8BwvK1fFrEJKt6FkW+GhsmHNXJhEraOrw24e2gRzsG6FNe7U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051491; c=relaxed/simple; bh=MldmtXDorhmIBYhwqmMUse0+Jmfv7UoaoLUN0QpDGqw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=fpgk5ixy1h4BBrZiyAz+aEuWxUpWUuul/0Y+Akv55cFqOur//1/yP5texGfCV2mRPX2TsI4vOVqWkVjmbv4Ea31pw0WwUY5a/FEeV/Zc8v2WKVUbQ9l5b5TF9eUxpx0xK0kAB7DXdf1wsIayT+5AA1FGYQg/YPGq6Ifx9drO3es= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=P2qn9qiX; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="P2qn9qiX" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780051489; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7Ni0R1/P1iP4W4/HjqyWOb5R/51zF+yeImHzPpCTTz4=; b=P2qn9qiXD8MFs5WzpQ2Q/PMj7UYMveh2Pv62ZNQDUBCmBOOJXjY+sAPCdXNBLURQtIEKa0 K2HHBF9uh3ece1IMYcY5UE+yxOyMREPxloYwV7BEZ3AdaVNc8grOYNsI9tqGM1e9YirEqf gaAqAxL0D4Y4OiBdnJ6cmu0J8pbjl8U= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-320-h2LwUPOwMXS8ciFV-jZIXQ-1; Fri, 29 May 2026 06:44:47 -0400 X-MC-Unique: h2LwUPOwMXS8ciFV-jZIXQ-1 X-Mimecast-MFC-AGG-ID: h2LwUPOwMXS8ciFV-jZIXQ_1780051486 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 685CE1956048 for ; Fri, 29 May 2026 10:44:46 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.235]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9E4DF19560A3 for ; Fri, 29 May 2026 10:44:45 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v10 mptcp-next 1/3] Squash-to: "mptcp: implemented OoO queue pruning" Date: Fri, 29 May 2026 12:44:29 +0200 Message-ID: <34aad036969501b2e71025a68bd3925e478bc3c8.1780049797.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: ts4n9eo3aZ3PCMcqO6ACZZWSi9y_tDTkHJjSBUO3Drc_1780051486 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" As discussed in the original patch, use the proper helper to get the currently used memory, and let mptcp_prune_ofo_queue() return true when skb can be accepted, to avoid code churn in the caller. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 29cb10c02ed8..03234e8cc26c 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -376,7 +376,7 @@ static void mptcp_init_skb(struct sock *ssk, struct sk_= buff *skb, int offset, /* "Inspired" from the TCP version; main difference: stop as soon as the M= PTCP * socket is under memory limit. */ -static void mptcp_prune_ofo_queue(struct sock *sk, u64 seq) +static bool mptcp_prune_ofo_queue(struct sock *sk, u64 seq) { struct mptcp_sock *msk =3D mptcp_sk(sk); struct rb_node *node, *prev; @@ -384,7 +384,7 @@ static void mptcp_prune_ofo_queue(struct sock *sk, u64 = seq) u64 mem; =20 if (RB_EMPTY_ROOT(&msk->out_of_order_queue)) - return; + goto out; =20 node =3D &msk->ooo_last_skb->rbnode; =20 @@ -401,7 +401,7 @@ static void mptcp_prune_ofo_queue(struct sock *sk, u64 = seq) mptcp_drop(sk, skb); msk->ooo_last_skb =3D rb_to_skb(prev); =20 - mem =3D (unsigned int)atomic_read(&sk->sk_rmem_alloc); + mem =3D (unsigned int)sk_rmem_alloc_get(sk); if (mem < sk->sk_rcvbuf) break; =20 @@ -410,6 +410,10 @@ static void mptcp_prune_ofo_queue(struct sock *sk, u64= seq) =20 if (pruned) MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_OFO_PRUNED); + +out: + mem =3D (unsigned int)sk_rmem_alloc_get(sk); + return mem < sk->sk_rcvbuf; } =20 static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb) @@ -424,13 +428,11 @@ static bool __mptcp_move_skb(struct sock *sk, struct = sk_buff *skb) * will break. 
*/ if (unlikely(sk_rmem_alloc_get(sk) > READ_ONCE(sk->sk_rcvbuf)) && - !__mptcp_check_fallback(msk)) { - mptcp_prune_ofo_queue(sk, MPTCP_SKB_CB(skb)->map_seq); - if (sk_rmem_alloc_get(sk) > READ_ONCE(sk->sk_rcvbuf)) { - MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED); - mptcp_drop(sk, skb); - return false; - } + !__mptcp_check_fallback(msk) && + !mptcp_prune_ofo_queue(sk, MPTCP_SKB_CB(skb)->map_seq)) { + MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED); + mptcp_drop(sk, skb); + return false; } =20 if (MPTCP_SKB_CB(skb)->map_seq =3D=3D msk->ack_seq) { --=20 2.54.0 From nobody Sat May 30 15:33:06 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D81B3C454F for ; Fri, 29 May 2026 10:44:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051493; cv=none; b=daUOXshOQtPXtTd2c9TXNW/aKrbp1BvWa6COAFZ6+E3B2QVxu/NNM0ylcA4DBwWShXSPNY4cmWX0K9tQsHeiEftoZFUyCn0SVGIsehFcHqBaZI/x5u8n1mFxCP/pVrf8Rl+wpWvZBK8VMMfPqbg31TMwGRgizbzktsEcyyUVG/E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051493; c=relaxed/simple; bh=tLIoZ0X+8OVh+2FxIO0d/v/7KUopjB+LKXZ95YApcWk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=U74MQGHy3io7daT0W9hJ0X0IAUbI0cBdR1/v6vZ7taaFpQZI2S77idcY6S68eS7/026a1/eMGMS3tXPPq2feNlW0Dkj1gOToCPpGPQ9y+aalPx4C0C1zw25OKeE97f0gcat7K+YqT7Jx2Vkl967bItlxCHAlwZv7M4hAbzyiKF8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=HhmJxhgL; arc=none smtp.client-ip=170.10.129.124 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HhmJxhgL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780051491; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZfYrL32wqJfYkOGUtZju/LDH7ze1uekDno+OFGYpcFo=; b=HhmJxhgL/JkxxpUjFPBCXzTbVtKPCC+OfaQ3VJqsI/0Z0xATcb1vbUKNL+Nt6HqB9TJ7HR cBYrNnrnRx8Qfrn3PrxB8WtDSi1SHI3m6rZaDKRpmnRrRtDjvT4r+dCuDedMuvyDfhS1yk aSG0DhvMB/LcNEvH2F3Cs8J74YdaGZk= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-453-ckbv2W2jOfWW-HOANOksUg-1; Fri, 29 May 2026 06:44:48 -0400 X-MC-Unique: ckbv2W2jOfWW-HOANOksUg-1 X-Mimecast-MFC-AGG-ID: ckbv2W2jOfWW-HOANOksUg_1780051487 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CDFBD1800359 for ; Fri, 29 May 2026 10:44:47 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.235]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 0F9EE19560A3 for ; Fri, 29 May 2026 10:44:46 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: 
[PATCH v10 mptcp-next 2/3] mptcp: move the retrans loop to a separate helper Date: Fri, 29 May 2026 12:44:30 +0200 Message-ID: <5d88f8b6cf46c0afd2ccd31332faf3a57dc5dff3.1780049797.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: _fd7fvAuZVeFw4isi3aK9WR0atVyYa3fUnK2KE7UK-k_1780051487 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" This is a cleanup in order to make the next patch simpler. No functional change intended. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 74 +++++++++++++++++++++++++------------------- 1 file changed, 43 insertions(+), 31 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 03234e8cc26c..51756800edc2 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2830,41 +2830,14 @@ static void mptcp_check_fastclose(struct mptcp_sock= *msk) sk_error_report(sk); } =20 -static void __mptcp_retrans(struct sock *sk) +/* Retransmit the specified data fragment on all the selected subflows. 
*/ +static int __mptcp_push_retrans(struct sock *sk, struct mptcp_data_frag *d= frag) { struct mptcp_sendmsg_info info =3D { .data_lock_held =3D true, }; struct mptcp_sock *msk =3D mptcp_sk(sk); struct mptcp_subflow_context *subflow; - struct mptcp_data_frag *dfrag; struct sock *ssk; - int ret, err; - u16 len =3D 0; - - mptcp_clean_una_wakeup(sk); - - /* first check ssk: need to kick "stale" logic */ - err =3D mptcp_sched_get_retrans(msk); - dfrag =3D mptcp_rtx_head(sk); - if (!dfrag) { - if (mptcp_data_fin_enabled(msk)) { - struct inet_connection_sock *icsk =3D inet_csk(sk); - - WRITE_ONCE(icsk->icsk_retransmits, - icsk->icsk_retransmits + 1); - mptcp_set_datafin_timeout(sk); - mptcp_send_ack(msk); - - goto reset_timer; - } - - if (!mptcp_send_head(sk)) - goto clear_scheduled; - - goto reset_timer; - } - - if (err) - goto reset_timer; + int ret, len =3D 0; =20 mptcp_for_each_subflow(msk, subflow) { if (READ_ONCE(subflow->scheduled)) { @@ -2892,7 +2865,7 @@ static void __mptcp_retrans(struct sock *sk) !msk->allow_subflows) { spin_unlock_bh(&msk->fallback_lock); release_sock(ssk); - goto clear_scheduled; + return -1; } =20 while (info.sent < info.limit) { @@ -2915,6 +2888,45 @@ static void __mptcp_retrans(struct sock *sk) release_sock(ssk); } } + return len; +} + +static void __mptcp_retrans(struct sock *sk) +{ + struct mptcp_sock *msk =3D mptcp_sk(sk); + struct mptcp_subflow_context *subflow; + struct mptcp_data_frag *dfrag; + int err, len; + + mptcp_clean_una_wakeup(sk); + + /* first check ssk: need to kick "stale" logic */ + err =3D mptcp_sched_get_retrans(msk); + dfrag =3D mptcp_rtx_head(sk); + if (!dfrag) { + if (mptcp_data_fin_enabled(msk)) { + struct inet_connection_sock *icsk =3D inet_csk(sk); + + WRITE_ONCE(icsk->icsk_retransmits, + icsk->icsk_retransmits + 1); + mptcp_set_datafin_timeout(sk); + mptcp_send_ack(msk); + + goto reset_timer; + } + + if (!mptcp_send_head(sk)) + goto clear_scheduled; + + goto reset_timer; + } + + if (err) + goto reset_timer; + + 
len =3D __mptcp_push_retrans(sk, dfrag); + if (len < 0) + goto clear_scheduled; =20 msk->bytes_retrans +=3D len; dfrag->already_sent =3D max(dfrag->already_sent, len); --=20 2.54.0 From nobody Sat May 30 15:33:06 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B07C3C1413 for ; Fri, 29 May 2026 10:44:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051493; cv=none; b=JAEe95NlaYPZQ69Ir8tGJ/t+Tzf2w5XJIR6Zf87OTM0xt7cH7xocmn6eaBDqQkCWQNaZhLioORtqzlPoS+IvNNy8qohwyJiHoHmD3g0K3hL5wUgxVhiuOsRedkEhTnmhGu6ENQ6Z7OfFjqzIQOdK7Gor5oUyLbUhb2sa8h3xcxs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051493; c=relaxed/simple; bh=RpOwwfwa3cLRMp+IegdeWs295SkedN5Sq07ztIGzztU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=X+X9Q61Z7vqXGnAD8pPwK9t0tbyUVf7Ip4VL7E3OY5D7VqtNiTr01D84l8GDwPPKMENHQxshGN8IRdICln28LnTwMsT51sueD3Wk3Cs7yQmr+OmCwZIKBBZ8FPkCCkY/j59fyeUD6NzdhMf2U25qWaRTlS99d2S9/1nU5P60CGw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=PCWioy6E; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="PCWioy6E" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; 
s=mimecast20190719; t=1780051491; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=whqzQlGcU623K1X7WjTsTJhi9OloiCONFW7uUD604pQ=; b=PCWioy6Eo/uitBxnmDtkyDcryNe3iLTFuaMVJgl8YeCEAEWguVbM6H9RaotGrmjiFsZgCw If3bMKF7D0VHDqadz4zc9Pc1UcjR9vYSXVNpOhAMQZ5znm5ZpTnsYY4Evt42K9PuTMOfMh zd7zCRmf5F3JS273Pv/JgouCXMPOqt0= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-307-kK6eyssOMWeTGtqBPWuHbQ-1; Fri, 29 May 2026 06:44:50 -0400 X-MC-Unique: kK6eyssOMWeTGtqBPWuHbQ-1 X-Mimecast-MFC-AGG-ID: kK6eyssOMWeTGtqBPWuHbQ_1780051489 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2F87819560AA for ; Fri, 29 May 2026 10:44:49 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.32.235]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4DD6B19560A3 for ; Fri, 29 May 2026 10:44:48 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v10 mptcp-next 3/3] mptcp: let the retrans scheduler do its job. 
Date: Fri, 29 May 2026 12:44:31 +0200 Message-ID: <199d3b27cc1f092a1bca686963103f922d0d27cd.1780049797.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: n5aV_BJpBlowyJijws7yWWIdv0mlmmyalX2Bo2nHbZU_1780051489 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" Currently the MPTCP core enforces that when the MPTCP-level retrans timer fires, at most a single dfrag is retransmitted. In some corner cases it may be necessary to retransmit multiple dfrags, and the MPTCP socket will need to wait for multiple retrans timeouts to accomplish that. Remove the mentioned constraint, allowing the retransmission of multiple dfrags per retrans period, as long as the scheduler keeps selecting subflows for retransmissions and pending data is available in the rtx queue. The default scheduler will transmit a dfrag per available subflow. Signed-off-by: Paolo Abeni --- v9 -> v10: - simpler handling for data-fin rtx v7 -> v8: - fix corner-case retrans_seq update v4 -> v5: - fixed already_sent update v3 -> v4: - avoid quadratic behavior, fix retrans_seq update - fix rtx timer re-schedule miss v2 -> v3: - fix infinite loop issue (should address tls test failures) v1 -> v2: - fix retrans sequence update (sashiko) Note: - sashiko may see issues when dfrag =3D mptcp_rtx_head(sk) !=3D NULL and dfrag->already_sent =3D=3D 0. That condition should not be possible: if mptcp_rtx_head() is not NULL there should be some data already sent. - sashiko may see missing data-fin rtx when the initial `dfrag` is not NULL. data-fin RTX is NOT needed in such a scenario. 
--- net/mptcp/protocol.c | 119 +++++++++++++++++++++++++++++++------------ 1 file changed, 87 insertions(+), 32 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 51756800edc2..7fe618e22d1b 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1201,13 +1201,6 @@ static void __mptcp_clean_una_wakeup(struct sock *sk) mptcp_write_space(sk); } =20 -static void mptcp_clean_una_wakeup(struct sock *sk) -{ - mptcp_data_lock(sk); - __mptcp_clean_una_wakeup(sk); - mptcp_data_unlock(sk); -} - static void mptcp_enter_memory_pressure(struct sock *sk) { struct mptcp_subflow_context *subflow; @@ -2830,8 +2823,12 @@ static void mptcp_check_fastclose(struct mptcp_sock = *msk) sk_error_report(sk); } =20 -/* Retransmit the specified data fragment on all the selected subflows. */ -static int __mptcp_push_retrans(struct sock *sk, struct mptcp_data_frag *d= frag) +/* + * Retransmit the specified data fragment on all the selected subflows, + * starting from the specified sequence + */ +static int __mptcp_push_retrans(struct sock *sk, struct mptcp_data_frag *d= frag, + u64 sent_seq) { struct mptcp_sendmsg_info info =3D { .data_lock_held =3D true, }; struct mptcp_sock *msk =3D mptcp_sk(sk); @@ -2841,6 +2838,7 @@ static int __mptcp_push_retrans(struct sock *sk, stru= ct mptcp_data_frag *dfrag) =20 mptcp_for_each_subflow(msk, subflow) { if (READ_ONCE(subflow->scheduled)) { + u16 offset =3D sent_seq - dfrag->data_seq; u16 copied =3D 0; =20 mptcp_subflow_set_scheduled(subflow, false); @@ -2850,9 +2848,12 @@ static int __mptcp_push_retrans(struct sock *sk, str= uct mptcp_data_frag *dfrag) lock_sock(ssk); =20 /* limit retransmission to the bytes already sent on some subflows */ - info.sent =3D 0; + info.sent =3D offset; info.limit =3D READ_ONCE(msk->csum_enabled) ? 
dfrag->data_len : dfrag->already_sent; + DEBUG_NET_WARN_ON_ONCE(!before64(sent_seq, + dfrag->data_seq + + info.limit)); =20 /* * make the whole retrans decision, xmit, disallow @@ -2896,14 +2897,85 @@ static void __mptcp_retrans(struct sock *sk) struct mptcp_sock *msk =3D mptcp_sk(sk); struct mptcp_subflow_context *subflow; struct mptcp_data_frag *dfrag; + bool need_retrans; + u64 retrans_seq; int err, len; =20 - mptcp_clean_una_wakeup(sk); - - /* first check ssk: need to kick "stale" logic */ - err =3D mptcp_sched_get_retrans(msk); + mptcp_data_lock(sk); + __mptcp_clean_una_wakeup(sk); + retrans_seq =3D msk->snd_una; dfrag =3D mptcp_rtx_head(sk); - if (!dfrag) { + need_retrans =3D !!dfrag; + mptcp_data_unlock(sk); + if (!dfrag) + goto check_data_fin; + + for (;;) { + bool already_retrans; + u64 sent_seq; + + /* The default scheduler will kick "stale" logic, that in + * turn can process incoming acks and clean the RTX queue; + * ensure that the current dfrag will still be around + * afterwards. + */ + get_page(dfrag->page); + err =3D mptcp_sched_get_retrans(msk); + if (err) { + put_page(dfrag->page); + break; + } + + /* Incoming acks can have moved retrans sequence after + * the current dfrag, if so try to start again from RTX head. + */ + mptcp_data_lock(sk); + already_retrans =3D !dfrag->already_sent || + !before64(msk->snd_una, dfrag->data_seq + + dfrag->already_sent); + put_page(dfrag->page); + if (already_retrans) { + __mptcp_clean_una_wakeup(sk); + retrans_seq =3D msk->snd_una; + dfrag =3D mptcp_rtx_head(sk); + need_retrans =3D !!dfrag; + } else if (after64(msk->snd_una, retrans_seq)) { + retrans_seq =3D msk->snd_una; + } + mptcp_data_unlock(sk); + if (!dfrag) + break; + + /* Can fail only in case of fallback. 
*/ + len =3D __mptcp_push_retrans(sk, dfrag, retrans_seq); + if (len < 0) + goto clear_scheduled; + + retrans_seq +=3D len; + msk->bytes_retrans +=3D len; + dfrag->already_sent =3D max_t(u16, dfrag->already_sent, + retrans_seq - dfrag->data_seq); + + /* With csum enabled retransmission can send new data. */ + sent_seq =3D dfrag->already_sent + dfrag->data_seq; + if (after64(sent_seq, msk->snd_nxt)) + WRITE_ONCE(msk->snd_nxt, sent_seq); + + /* Attempt the next fragment only if the current one is + * completely retransmitted. + */ + if (before64(retrans_seq, dfrag->data_seq + dfrag->data_len)) + break; + + dfrag =3D list_is_last(&dfrag->list, &msk->rtx_queue) ? + NULL : list_next_entry(dfrag, list); + if (!dfrag || !dfrag->already_sent) + break; + } + + /* Attempt data-fin retransmission only when the RTX queue is empty. */ + if (!need_retrans) { +check_data_fin: if (mptcp_data_fin_enabled(msk)) { struct inet_connection_sock *icsk =3D inet_csk(sk); =20 @@ -2911,30 +2983,13 @@ static void __mptcp_retrans(struct sock *sk) icsk->icsk_retransmits + 1); mptcp_set_datafin_timeout(sk); mptcp_send_ack(msk); - goto reset_timer; } =20 if (!mptcp_send_head(sk)) goto clear_scheduled; - - goto reset_timer; } =20 - if (err) - goto reset_timer; - - len =3D __mptcp_push_retrans(sk, dfrag); - if (len < 0) - goto clear_scheduled; - - msk->bytes_retrans +=3D len; - dfrag->already_sent =3D max(dfrag->already_sent, len); - - /* With csum enabled retransmission can send new data. */ - if (after64(dfrag->already_sent + dfrag->data_seq, msk->snd_nxt)) - WRITE_ONCE(msk->snd_nxt, dfrag->already_sent + dfrag->data_seq); - reset_timer: mptcp_check_and_set_pending(sk); =20 --=20 2.54.0