From nobody Sun Mar 22 08:30:24 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B165836212B; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; cv=none; b=t6MDNdtrbdpJScmSXWhSESA93gIm1hsb1iX/FXuIJtD/fUog0ChpkdoZUNM1NGYHm3GgFH47ReOjU3X3JtnxsXYby2gwhbnGeTZcbD0e7357MEUDvOfAJchgpMi20IDmhLI+5sKjLKFaKdC5JejMfH59qBjIOcLhBjH6466W/yQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; c=relaxed/simple; bh=6lpGQhvEHoauKsBpF2oCmP3IaU50PTmtKr43V5GFemI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=DeF/1WnLKmbPRuVhW0r5NPA7MRy3gzkakNrKpycBuDnGcs7OL8stFodKHaVTbF8uvvO5Jd8BQIwdIH9kDy8ao3Wdb5qMhKEv8M4dmd6EzSaUxrDTYJcSGo+ftx+SexgkZnOJ3UR21qM9S1h3ldOcHq4YzxaoOsjfyZcJUznf8S8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=A2F3SSC+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="A2F3SSC+" Received: by smtp.kernel.org (Postfix) with ESMTPS id 5F219C19423; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773043426; bh=6lpGQhvEHoauKsBpF2oCmP3IaU50PTmtKr43V5GFemI=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=A2F3SSC+psc+/q4F013fuYWvEsxH+vn35ViMvHIxlHOJ3jxtkMDZw97V1SgLeoJ0D PvlmT0Muzo9KGOzL/w05wIqqcwrya1VqcMMerMv4EjDjr23WD18wfkWr2CRs3RFIgo NBy7QszSuBpTfzrzpBgwXnqobeU81oGHtSfw0JA0LDTR5nPVaGHzDZQE/sVGH/H8U5 
MDELMpYhDkoKRIhXLBe+S3EjZwEl9CXzPuDhSZheEXrhw/zYOk0JQ1DM7JbV7j5I1N AOoMsYSqs8hL50G7dQ5lzbn6+Z3jtXQGwotDT5+n24VH5DzNkf3tIIZTCm7eeSMnx/ DfEqNEbh1ByBw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D0E1EF3706; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) From: Simon Baatz via B4 Relay Date: Mon, 09 Mar 2026 09:02:26 +0100 Subject: [PATCH net-next v3 1/6] tcp: implement RFC 7323 window retraction receiver requirements Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com> References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> To: Eric Dumazet , Neal Cardwell , Kuniyuki Iwashima , "David S. 
Miller" , Jakub Kicinski , Paolo Abeni , Simon Horman , Jonathan Corbet , Shuah Khan , David Ahern , Jon Maloy , Jason Xing , mfreemon@cloudflare.com, Shuah Khan , Stefano Brivio , Matthieu Baerts , Mat Martineau , Geliang Tang
Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, mptcp@lists.linux.dev, Simon Baatz
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=ed25519-sha256; t=1773043425; l=11695; i=gmbnomis@gmail.com; s=20260220; h=from:subject:message-id; bh=9E8zCE5nmQbTZ9o2NygPjxIomKughsAZvjNUYtJajfw=; b=V8tdU1ByKFW0d7c32JNgV3KR3ohZisAHDtPmNZ0FbMNAZI0ZP5ipHEi71GROA7dyV2dvT5cMW 1iu9BHemoNjDvZ0LWEx34wL4+1Hwzsl0TVCKUWsJfKmf8wZLEcmXrEs
X-Developer-Key: i=gmbnomis@gmail.com; a=ed25519; pk=T/JIz/6F5bf1uQJr69lmyi7czVG+F9TVZ/8x5z9Wtqw=
X-Endpoint-Received: by B4 Relay for gmbnomis@gmail.com/20260220 with auth_id=641
X-Original-From: Simon Baatz
Reply-To: gmbnomis@gmail.com

From: Simon Baatz

By default, the Linux TCP implementation does not shrink the advertised window (RFC 7323 calls this "window retraction") with the following exceptions:

- When an incoming segment cannot be added due to the receive buffer running out of memory. Since commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze") a zero window will be advertised in this case. It turns out that reaching the required memory pressure is easy when window scaling is in use. In the simplest case, sending a sufficient number of segments smaller than the scale factor to a receiver that does not read data is enough.

- Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by allowing the tcp window to shrink") addressed the "eating memory" problem by introducing a sysctl knob that allows shrinking the window before running out of memory.

However, RFC 7323 does not only state that shrinking the window is necessary in some cases, it also formulates requirements for TCP implementations when doing so (Section 2.4).

This commit addresses the receiver-side requirements: After retracting the window, the peer may have a snd_nxt that lies within a previously advertised window but is now beyond the retracted window. This means that all incoming segments (including pure ACKs) will be rejected until the application happens to read enough data to bring the peer's snd_nxt back into the window (which may never happen).

To comply with RFC 7323, the receiver MUST honor any segment that would have been in window for any ACK sent by the receiver and, when window scaling is in effect, SHOULD track the maximum window sequence number it has advertised. This patch tracks that maximum window sequence number in rcv_mwnd_seq throughout the connection and uses it in tcp_sequence() when deciding whether a segment is acceptable.

rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in tcp_select_window(). It is read in tcp_sequence(), so if tcp_sequence() counts as fast path, rcv_mwnd_seq is read in the fast path. Therefore, rcv_mwnd_seq is placed in rcv_wnd's cacheline group.

The logic for handling received data in tcp_data_queue() is already sufficient and does not need to be updated.

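The tracking described above can be illustrated outside the kernel with a small user-space sketch. It is not the patch itself: `struct rx_state`, `advertise()` and `seq_acceptable()` are hypothetical names chosen for this illustration, and the check is deliberately reduced to the right-edge comparison (the old-sequence test, the FIN allowance and the empty-queue exception in tcp_sequence() are left out).

```c
#include <stdint.h>

/* Wraparound-safe "a is after b" for 32-bit sequence numbers. */
static int seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical stand-in for the receive-side fields of struct tcp_sock. */
struct rx_state {
	uint32_t rcv_nxt;      /* next sequence number expected */
	uint32_t rcv_wup;      /* rcv_nxt at the time of the last window update */
	uint32_t rcv_wnd;      /* currently advertised window */
	uint32_t rcv_mwnd_seq; /* highest right window edge advertised so far */
};

/* Advertise a new window and remember the highest right edge seen so far. */
static void advertise(struct rx_state *rx, uint32_t new_wnd)
{
	uint32_t right_edge;

	rx->rcv_wnd = new_wnd;
	rx->rcv_wup = rx->rcv_nxt;
	right_edge = rx->rcv_wup + rx->rcv_wnd;
	if (seq_after(right_edge, rx->rcv_mwnd_seq))
		rx->rcv_mwnd_seq = right_edge;
}

/* Simplified check: a segment passes if it does not end beyond the
 * maximum right edge that was ever advertised.
 */
static int seq_acceptable(const struct rx_state *rx, uint32_t end_seq)
{
	return !seq_after(end_seq, rx->rcv_mwnd_seq);
}
```

With rcv_nxt = 1000, advertising 65000 bytes puts the right edge at 66000. If 20000 bytes then arrive and the window is retracted to zero, rcv_mwnd_seq stays at 66000, so a pure ACK carrying sequence number 50000 still passes the simplified check while a segment ending at 70000 does not.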
Signed-off-by: Simon Baatz
Reviewed-by: Eric Dumazet
---
 .../networking/net_cachelines/tcp_sock.rst | 1 +
 include/linux/tcp.h | 3 +++
 include/net/tcp.h | 22 ++++++++++++++++++++++
 net/ipv4/tcp.c | 2 ++
 net/ipv4/tcp_fastopen.c | 1 +
 net/ipv4/tcp_input.c | 10 +++++-----
 net/ipv4/tcp_minisocks.c | 1 +
 net/ipv4/tcp_output.c | 3 +++
 .../net/packetdrill/tcp_rcv_big_endseq.pkt | 2 +-
 9 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/net_cachelines/tcp_sock.rst b/Documentation/networking/net_cachelines/tcp_sock.rst
index 563daea10d6c5c074f004cb1b8574f5392157abb..fecf61166a54ee2f64bcef5312c81dcc4aa9a124 100644
--- a/Documentation/networking/net_cachelines/tcp_sock.rst
+++ b/Documentation/networking/net_cachelines/tcp_sock.rst
@@ -121,6 +121,7 @@ u64 delivered_mstamp read_write
 u32 rate_delivered read_mostly tcp_rate_gen
 u32 rate_interval_us read_mostly rate_delivered,rate_app_limited
 u32 rcv_wnd read_write read_mostly tcp_select_window,tcp_receive_window,tcp_fast_path_check
+u32 rcv_mwnd_seq read_write tcp_select_window
 u32 write_seq read_write tcp_rate_check_app_limited,tcp_write_queue_empty,tcp_skb_entail,forced_push,tcp_mark_push
 u32 notsent_lowat read_mostly tcp_stream_memory_free
 u32 pushed_seq read_write tcp_mark_push,forced_push
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f72eef31fa23cc584f2f0cefacdc35cae43aa52d..73aa2e0ccd1d7a6314a00c27950b019b62a3851c 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -316,6 +316,9 @@ struct tcp_sock {
 	*/
 	u32 app_limited; /* limited until "delivered" reaches this val */
 	u32 rcv_wnd; /* Current receiver window */
+	u32 rcv_mwnd_seq; /* Maximum window sequence number (RFC 7323,
+	 * section 2.4, receiver requirements)
+	 */
 	u32 rcv_tstamp; /* timestamp of last received ACK (for keepalives) */
 /*
  * Options received (usually on last packet, some only on SYN packets).
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a6464142380696e4948a836145ac7aca4ca3ec15..5fa8455ee9bc52d1434feaf82dda80be067a36e6 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -921,6 +921,28 @@ static inline u32 tcp_receive_window(const struct tcp_sock *tp)
 	return (u32) win;
 }
 
+/* Compute the maximum receive window we ever advertised.
+ * Rcv_nxt can be after the window if our peer push more data
+ * than the offered window.
+ */
+static inline u32 tcp_max_receive_window(const struct tcp_sock *tp)
+{
+	s32 win = tp->rcv_mwnd_seq - tp->rcv_nxt;
+
+	if (win < 0)
+		win = 0;
+	return (u32) win;
+}
+
+/* Check if we need to update the maximum receive window sequence number */
+static inline void tcp_update_max_rcv_wnd_seq(struct tcp_sock *tp)
+{
+	u32 wre = tp->rcv_wup + tp->rcv_wnd;
+
+	if (after(wre, tp->rcv_mwnd_seq))
+		tp->rcv_mwnd_seq = wre;
+}
+
 /* Choose a new window, without checks for shrinking, and without
  * scaling applied to the result. The caller does these things
  * if necessary. This is a "raw" window selection.
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index ed6f6712f06076dc33af61947782bde436dde15e..516087c622ade78883ca41e4f883740e305035a0 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3561,6 +3561,7 @@ static int tcp_repair_set_window(struct tcp_sock *tp, sockptr_t optbuf, int len)
 
 	tp->rcv_wnd = opt.rcv_wnd;
 	tp->rcv_wup = opt.rcv_wup;
+	tp->rcv_mwnd_seq = opt.rcv_wup + opt.rcv_wnd;
 
 	return 0;
 }
@@ -5275,6 +5276,7 @@ static void __init tcp_struct_check(void)
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_mwnd_seq);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_tstamp);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt);
 
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index 9fdc19accafd23c6ab74bd82f7a7d82de1d60b90..4e389d609f919c17435509c5007bc3b2a13eac6c 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -377,6 +377,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk,
 
 	tcp_rsk(req)->rcv_nxt = tp->rcv_nxt;
 	tp->rcv_wup = tp->rcv_nxt;
+	tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd;
 	/* tcp_conn_request() is sending the SYNACK,
 	 * and queues the child into listener accept queue.
 	 */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 71ac69b7b75e4919f69631a4894421fa4e417c95..2e1b237608150c2e9c9baf73cf047ed0823ca555 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4808,20 +4808,18 @@ static enum skb_drop_reason tcp_sequence(const struct sock *sk,
 					 const struct tcphdr *th)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
-	u32 seq_limit;
 
 	if (before(end_seq, tp->rcv_wup))
 		return SKB_DROP_REASON_TCP_OLD_SEQUENCE;
 
-	seq_limit = tp->rcv_nxt + tcp_receive_window(tp);
-	if (unlikely(after(end_seq, seq_limit))) {
+	if (unlikely(after(end_seq, tp->rcv_nxt + tcp_max_receive_window(tp)))) {
 		/* Some stacks are known to handle FIN incorrectly; allow the
 		 * FIN to extend beyond the window and check it in detail later.
 		 */
-		if (!after(end_seq - th->fin, seq_limit))
+		if (!after(end_seq - th->fin, tp->rcv_nxt + tcp_receive_window(tp)))
 			return SKB_NOT_DROPPED_YET;
 
-		if (after(seq, seq_limit))
+		if (after(seq, tp->rcv_nxt + tcp_max_receive_window(tp)))
 			return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
 
 		/* Only accept this packet if receive queue is empty. */
@@ -6903,6 +6901,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 */
 		WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
 		tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
+		tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd;
 
 		/* RFC1323: The window in SYN & SYN/ACK segments is
 		 * never scaled.
@@ -7015,6 +7014,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
 		WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 		tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
+		tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd;
 
 		/* RFC1323: The window in SYN & SYN/ACK segments is
 		 * never scaled.
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index dafb63b923d0d08cb1a0e9a37d8ec025386a960a..d350d794a959720853ffd8937cfdc34c03e2ce30 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -604,6 +604,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 	newtp->window_clamp = req->rsk_window_clamp;
 	newtp->rcv_ssthresh = req->rsk_rcv_wnd;
 	newtp->rcv_wnd = req->rsk_rcv_wnd;
+	newtp->rcv_mwnd_seq = newtp->rcv_wup + req->rsk_rcv_wnd;
 	newtp->rx_opt.wscale_ok = ireq->wscale_ok;
 	if (newtp->rx_opt.wscale_ok) {
 		newtp->rx_opt.snd_wscale = ireq->snd_wscale;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f0ebcc7e287173be6198fd100130e7ba1a1dbf03..c86910d147f2394bf414d7691d8f90ed41c1b0e3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -293,6 +293,7 @@ static u16 tcp_select_window(struct sock *sk)
 		tp->pred_flags = 0;
 		tp->rcv_wnd = 0;
 		tp->rcv_wup = tp->rcv_nxt;
+		tcp_update_max_rcv_wnd_seq(tp);
 		return 0;
 	}
 
@@ -316,6 +317,7 @@ static u16 tcp_select_window(struct sock *sk)
 
 	tp->rcv_wnd = new_win;
 	tp->rcv_wup = tp->rcv_nxt;
+	tcp_update_max_rcv_wnd_seq(tp);
 
 	/* Make sure we do not exceed the maximum possible
 	 * scaled window.
@@ -4195,6 +4197,7 @@ static void tcp_connect_init(struct sock *sk)
 	else
 		tp->rcv_tstamp = tcp_jiffies32;
 	tp->rcv_wup = tp->rcv_nxt;
+	tp->rcv_mwnd_seq = tp->rcv_nxt + tp->rcv_wnd;
 	WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 
 	inet_csk(sk)->icsk_rto = tcp_timeout_init(sk);
diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
index 6c0f32c40f19be2a750fc9d69bbf64250cd7b525..12882be10f2e0cf19e6bc7bd2479b27c11ce8ac0 100644
--- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
@@ -36,7 +36,7 @@
 
    +0 read(4, ..., 100000) = 4000
 
-// If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
+// If queue is empty, accept a packet even if its end_seq is above rcv_mwnd_seq
    +0 < P. 4001:54001(50000) ack 1 win 257
     * > . 1:1(0) ack 54001 win 0
 
-- 
2.53.0

From nobody Sun Mar 22 08:30:24 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1755363C63; Mon, 9 Mar 2026 08:03:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; cv=none; b=D1e4HZK3n59WHZ9f2qHn52ODD+NRu2Gj1rA0MlsKgQiUehWt5eh17+/4RR1VB8H7WejJf48qY/IT9Oo0J+r0X+yqoZj4d7b+KVB6pi0L1O+ClO4X8NqW3Cn8+upxOnb9L2w1a8uCK9B5L491ye8NepIeGQF0nAAzddNAZ09Bh44=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; c=relaxed/simple; bh=U6eOZs3iNxBCZqrgik+0s/86JK4oBEk09rxPxv6C1jc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc;
b=O1OI1Md9Nz9xlujJ+bagvEJpng3jUleXB5jFT9BcG+lwmER4vX4bOmwAj5hEPvuVqYdKWeM1SdPvYvGxzRSck7O3/j1n1+w3srrpbtugiAbxiZ6ZWQg/z3SJ/D430d09v5Hs1jkj9ryxDNVXg93td2eG5pthSFzyj0IavX0629I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Vhjxp+Tz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Vhjxp+Tz" Received: by smtp.kernel.org (Postfix) with ESMTPS id 6EB55C2BC87; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773043426; bh=U6eOZs3iNxBCZqrgik+0s/86JK4oBEk09rxPxv6C1jc=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=Vhjxp+TzGcrl9Uu+gCftb9lUUbQwVQq3qygNlH4kUhLqY827Ph0rKVqKGnjPfgfqH NQLCgWuLPGkZOB1Cz6aIa3a29sRLeUX3Xmo/4j8sb31WL4vSN87TkcxOhkg6T8ptB0 8IQNtKBjZqADri0bWm8qgqfFWMjbXMeej1e8yThK7JMCLbHLkN1wwM1VWbXJGEUzf2 jZASe+9elgTFryZ3gc6a3vLm8KmoJuQOn/Ne3AuuXcpWOu1aXLSNhrWGxs65Vk0+BO PTT0wDcJzNLQflk399xUD4I+RsqxjBHbCD9s6G0zBRb0CU+IG832svcsdP5sqmxSlV p0bKeHKK+GhXQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D591EF36FB; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) From: Simon Baatz via B4 Relay Date: Mon, 09 Mar 2026 09:02:27 +0100 Subject: [PATCH net-next v3 2/6] mptcp: keep rcv_mwnd_seq in sync with subflow rcv_wnd Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-2-4c7f96b1ec69@gmail.com> References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> To: Eric Dumazet , Neal Cardwell , Kuniyuki Iwashima 
, "David S. Miller" , Jakub Kicinski , Paolo Abeni , Simon Horman , Jonathan Corbet , Shuah Khan , David Ahern , Jon Maloy , Jason Xing , mfreemon@cloudflare.com, Shuah Khan , Stefano Brivio , Matthieu Baerts , Mat Martineau , Geliang Tang
Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, mptcp@lists.linux.dev, Simon Baatz
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=ed25519-sha256; t=1773043425; l=1316; i=gmbnomis@gmail.com; s=20260220; h=from:subject:message-id; bh=ChftbPgxBgowAH5jlThlLmD2ZcIdQ5MNv0dOVMZk0bc=; b=ebwxj9jswhY7N+FX93x/tqn0TUOf6ljdixhMtvB9X8ZTBTSLzUET9CMuG0w7U6djoehuAO9z1 K6f847zwt1VANVMgE8WgLKsGJzlx1Qb2yKGdLXsQZf0vswy9hFLWe/W
X-Developer-Key: i=gmbnomis@gmail.com; a=ed25519; pk=T/JIz/6F5bf1uQJr69lmyi7czVG+F9TVZ/8x5z9Wtqw=
X-Endpoint-Received: by B4 Relay for gmbnomis@gmail.com/20260220 with auth_id=641
X-Original-From: Simon Baatz
Reply-To: gmbnomis@gmail.com

From: Simon Baatz

MPTCP shares a receive window across subflows and applies it at the subflow level by adjusting each subflow's rcv_wnd when needed. With the new TCP tracking of the maximum advertised window sequence, rcv_mwnd_seq must stay consistent with these subflow-level rcv_wnd adjustments.

Signed-off-by: Simon Baatz
Reviewed-by: Eric Dumazet
Reviewed-by: Matthieu Baerts (NGI0)
---
 net/mptcp/options.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 43df4293f58bfbd8a8df6bf24b9f15e0f9e238f6..8a1c5698983cff3082d68290626dd8f1e044527f 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1076,6 +1076,7 @@ static void rwin_update(struct mptcp_sock *msk, struct sock *ssk,
 	 * resync.
 	 */
 	tp->rcv_wnd += mptcp_rcv_wnd - subflow->rcv_wnd_sent;
+	tcp_update_max_rcv_wnd_seq(tp);
 	subflow->rcv_wnd_sent = mptcp_rcv_wnd;
 }
 
@@ -1338,8 +1339,9 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 		 */
 		rcv_wnd_new = rcv_wnd_old;
 		win = rcv_wnd_old - ack_seq;
-		tp->rcv_wnd = min_t(u64, win, U32_MAX);
-		new_win = tp->rcv_wnd;
+		new_win = min_t(u64, win, U32_MAX);
+		tp->rcv_wnd = new_win;
+		tcp_update_max_rcv_wnd_seq(tp);
 
 		/* Make sure we do not exceed the maximum possible
 		 * scaled window.
-- 
2.53.0

From nobody Sun Mar 22 08:30:24 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B8F91364024; Mon, 9 Mar 2026 08:03:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; cv=none; b=qvSeIzs85Oi3g/EU0hs6/L4QCouzi2EXPclzImv7SepexnAjt4VzOOgsNwINw7Czx99OqPz9LzKuR4PKLMwvUP0jbZugJEi4ExUPvaxhcOAKM2murogQmnh2r7ZaV7JgIbvtVBqPsP0sLqPlV5VD+uYIvsQzLEQK1zagDh3rwKc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; c=relaxed/simple; bh=PF4BYVQIZV4oiBVr6sVA/qCdRg+/Hpu3GxmXH0vlCp4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=GNreYX4b5I+C8PggwE0j3BWckXNpgG8f9qVeFVRvPy5yUl70FnaxftnkTpeYc6sEDXS4MdfpbJ7LPkvSXIIVmrDkFnFmCsYGm1dPh0Sv5h5pys4mGEvIdaVbgDlIy+MB4FTTwJK0D4rpQLd5mUMe8igljxnohowBMDUPuX2I/VA=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=it9EN7WQ; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="it9EN7WQ"
Received: by
smtp.kernel.org (Postfix) with ESMTPS id 7D273C2BCAF; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773043426; bh=PF4BYVQIZV4oiBVr6sVA/qCdRg+/Hpu3GxmXH0vlCp4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=it9EN7WQc94rTb5cRGgCpBpC+cj2zX5hDKm3THKsESfenOxz/IpKyLewL9fmP2SY9 wh0Ix3wtBOLd1aWKnypJ6OIUAdIEv5uhAoIVMS4eYicq2eVnVWmhPo6Rlrzy3rn0YW FNILMNJbDBgzhhxA4Zc5g0xCcSi7cEhyNEKqcHW1Z87vjIdfpeLCh3f74616n+Z3Wo vlFiJhFoVzQt57qJtPjbWAea/Wi5p4V9GU4fWbW9GmFByJvQtWTKeJ8jtI18AHsnc/ QWPM3OnTWZ5ExHYEMlFTpbSF4onM83wPdbKyiGtmhDneu+UjDtKCamRSYiRE925LMT cvPg6lxa6V7Kg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EE93EF3709; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) From: Simon Baatz via B4 Relay Date: Mon, 09 Mar 2026 09:02:28 +0100 Subject: [PATCH net-next v3 3/6] tcp: increase LINUX_MIB_BEYOND_WINDOW for SKB_DROP_REASON_TCP_OVERWINDOW Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-3-4c7f96b1ec69@gmail.com> References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> To: Eric Dumazet , Neal Cardwell , Kuniyuki Iwashima , "David S. 
Miller" , Jakub Kicinski , Paolo Abeni , Simon Horman , Jonathan Corbet , Shuah Khan , David Ahern , Jon Maloy , Jason Xing , mfreemon@cloudflare.com, Shuah Khan , Stefano Brivio , Matthieu Baerts , Mat Martineau , Geliang Tang Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, mptcp@lists.linux.dev, Simon Baatz X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1773043425; l=994; i=gmbnomis@gmail.com; s=20260220; h=from:subject:message-id; bh=h0fO7UL8KcWCVKdqBvyw3pkMf8DPB6tVI+LGsyguBYs=; b=7/WazVRQuzI6bzee3cpoq4pYjQNWmvWmXipOZLpR8MPj4XIv7lRXuRQWQjerVzWTs5qLbOt68 aOgylmzR+b7Brk0uC/+a/cKqwVA5l/sUSkIdC1L1EFb4zNj/dvghxMD X-Developer-Key: i=gmbnomis@gmail.com; a=ed25519; pk=T/JIz/6F5bf1uQJr69lmyi7czVG+F9TVZ/8x5z9Wtqw= X-Endpoint-Received: by B4 Relay for gmbnomis@gmail.com/20260220 with auth_id=641 X-Original-From: Simon Baatz Reply-To: gmbnomis@gmail.com From: Simon Baatz Since commit 9ca48d616ed7 ("tcp: do not accept packets beyond window"), the path leading to SKB_DROP_REASON_TCP_OVERWINDOW in tcp_data_queue() is probably dead. However, it can be reached now when tcp_max_receive_window() is larger than tcp_receive_window(). In that case, increment LINUX_MIB_BEYOND_WINDOW as done in tcp_sequence(). 
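As an illustration of why the drop path in tcp_data_queue() becomes reachable, the following user-space sketch classifies the start sequence number of an out-of-order segment against two right edges: the currently advertised one and the maximum ever advertised. It is a simplified model, not kernel code: `classify()`, `window_from()` and `enum verdict` are hypothetical names, both checks are folded into one function, and the zero-window, FIN and empty-queue special cases are omitted.

```c
#include <stdint.h>

enum verdict { IN_WINDOW, OVERWINDOW, INVALID_SEQUENCE };

/* Wraparound-safe comparisons for 32-bit sequence numbers. */
static int seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static int seq_after(uint32_t a, uint32_t b)
{
	return seq_before(b, a);
}

/* Window size between rcv_nxt and a right edge, clamped at zero. */
static uint32_t window_from(uint32_t right_edge, uint32_t rcv_nxt)
{
	int32_t win = (int32_t)(right_edge - rcv_nxt);

	return win < 0 ? 0 : (uint32_t)win;
}

/* adv_edge: right edge of the currently advertised window.
 * max_edge: highest right edge ever advertised (rcv_mwnd_seq in the patch).
 * *beyond_window models the LINUX_MIB_BEYOND_WINDOW counter.
 */
static enum verdict classify(uint32_t seq, uint32_t rcv_nxt,
			     uint32_t adv_edge, uint32_t max_edge,
			     unsigned long *beyond_window)
{
	/* First check, modelled on tcp_sequence(): beyond the maximum window. */
	if (seq_after(seq, rcv_nxt + window_from(max_edge, rcv_nxt))) {
		(*beyond_window)++;
		return INVALID_SEQUENCE;
	}
	/* Second check, modelled on tcp_data_queue(): passed the first check
	 * but starts at or beyond the currently advertised window.
	 */
	if (!seq_before(seq, rcv_nxt + window_from(adv_edge, rcv_nxt))) {
		(*beyond_window)++;
		return OVERWINDOW;
	}
	return IN_WINDOW;
}
```

When both edges coincide, no sequence number can pass the first check and fail the second, which matches the observation that the path was probably dead before this series. Once the advertised edge falls behind the maximum edge, the second branch is taken for everything in between, and the model counts it the same way as the first.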
Signed-off-by: Simon Baatz
Reviewed-by: Eric Dumazet
---
 net/ipv4/tcp_input.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2e1b237608150c2e9c9baf73cf047ed0823ca555..e6b2f4be7723db14acf2ae528df17b6d106b9da9 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5678,6 +5678,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 	if (!before(TCP_SKB_CB(skb)->seq,
 		    tp->rcv_nxt + tcp_receive_window(tp))) {
 		reason = SKB_DROP_REASON_TCP_OVERWINDOW;
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_BEYOND_WINDOW);
 		goto out_of_window;
 	}
 
-- 
2.53.0

From nobody Sun Mar 22 08:30:24 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CAD6B367F27; Mon, 9 Mar 2026 08:03:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; cv=none; b=mGVYlnS/VJe5vUyrejcV2V/MIwiNUMcTYYI16Oeg474ukcY/I5H1k6G2GIBbFxEN2Nnr3juo1xuxkf4Fn9Sna76Gk4ufaCBO5NeTGgQHU+yWWin4fN/Mn0+oWQNtvy3ETVZpmj0vSC8LeZhiWVijiJctmBU8WsPIEht+Bf9njyY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; c=relaxed/simple; bh=SbrAJGk70EbIW4bgCwqcNeA0pD4BpRhVNFzp7yOtJi8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=R7qoTZK87wWiQnaQvT35x5TMlnthCLIKpHWaO/FI1Hj6eQLV3pIgadzndZPyv/aO3ZgM5h4O/wtjxvhxDsHimAWORMGrrudnwjw1JkpdFjbzQ/4QsHVR/Q3YpZi+I71bfFBnCE6SOOcZmjmXD3sK4uMC9mFE/l70H0T6BV2eo4M=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=CEv3G+sF; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit
key) header.d=kernel.org header.i=@kernel.org header.b="CEv3G+sF" Received: by smtp.kernel.org (Postfix) with ESMTPS id 8C288C2BCB0; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773043426; bh=SbrAJGk70EbIW4bgCwqcNeA0pD4BpRhVNFzp7yOtJi8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=CEv3G+sFENvoUP3eYAHiYa0LR1jTIU2qbjakGQDpCVz2NKn6apDqroVnUAqLZn71F zdJZ9NIyPb3NQ6gqbrs6LEWJWzvvVCuRcQYzdc+B4zb0UGh1yhUKVA/5xT6pQJI/ZJ xkU/BGPGPO8X3BEKOCUHDi0cAdk5OrB2d+6FRz55li8q2ryDYeU1JJGFd/hYtj9tnc ZgjvA2FYLDJA4B82oDopBk0RZLNpkGqLKhFRK37Pq1DVQQWUCz31wwchCbLUP+P7NV IXd4gqSKIEtoLhUKF5A4NEe07DGeBRFaYrM2SqP60p7IKCCwHdyTIhYrhFzVuCg6fR 9FlP0+xg7V4QQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 824F6EF3706; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) From: Simon Baatz via B4 Relay Date: Mon, 09 Mar 2026 09:02:29 +0100 Subject: [PATCH net-next v3 4/6] selftests/net: packetdrill: add tcp_rcv_wnd_shrink_nomem.pkt Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-4-4c7f96b1ec69@gmail.com> References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> To: Eric Dumazet , Neal Cardwell , Kuniyuki Iwashima , "David S. 
Miller" , Jakub Kicinski , Paolo Abeni , Simon Horman , Jonathan Corbet , Shuah Khan , David Ahern , Jon Maloy , Jason Xing , mfreemon@cloudflare.com, Shuah Khan , Stefano Brivio , Matthieu Baerts , Mat Martineau , Geliang Tang Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, mptcp@lists.linux.dev, Simon Baatz X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1773043425; l=6675; i=gmbnomis@gmail.com; s=20260220; h=from:subject:message-id; bh=N/81u3UphdzG27V80GBzm+l0tP/HFFOfJpFHiI1//LI=; b=O6x0ANLGhslVcC1+3RGJ5nFPK+fcPMRzB4O0ymz76PjW+o4CTIwgiCU1iTUE5+i24Jcz/9EJP t4n8hUyXVyuDPjjAlT9nSgVACT1jy1vPFra7fTpVnE0+xh2xq37MYQN X-Developer-Key: i=gmbnomis@gmail.com; a=ed25519; pk=T/JIz/6F5bf1uQJr69lmyi7czVG+F9TVZ/8x5z9Wtqw= X-Endpoint-Received: by B4 Relay for gmbnomis@gmail.com/20260220 with auth_id=641 X-Original-From: Simon Baatz Reply-To: gmbnomis@gmail.com From: Simon Baatz This test verifies - the sequence number checks using the maximum advertised window sequence number and - the logic for handling received data in tcp_data_queue() for the cases: 1. The window is reduced to zero because of memory 2. 
The window grows again but still does not reach the originally advertised window

Signed-off-by: Simon Baatz
Reviewed-by: Eric Dumazet
---
 .../net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt | 132 +++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..69b060c548eac50f5dc5c034c90f0b8eae4b4fa6
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+// When tcp_receive_window() < tcp_max_receive_window(), tcp_sequence() accepts
+// packets that would be dropped under normal conditions (i.e. tcp_receive_window()
+// equal to tcp_max_receive_window()).
+// Test that such packets are handled as expected for RWIN == 0 and for RWIN > 0.
+
+--mss=1000
+
+`./defaults.sh`
+
+    0 `nstat -n`
+
+// Establish a connection.
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [1000000], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792
+   +0 > S. 0:0(0) ack 1 win 65535
+   +0 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+// Put 1040000 bytes into the receive buffer
+   +0 < P. 1:65001(65000) ack 1 win 257
+    * > . 1:1(0) ack 65001
+   +0 < P. 65001:130001(65000) ack 1 win 257
+    * > . 1:1(0) ack 130001
+   +0 < P. 130001:195001(65000) ack 1 win 257
+    * > . 1:1(0) ack 195001
+   +0 < P. 195001:260001(65000) ack 1 win 257
+    * > . 1:1(0) ack 260001
+   +0 < P. 260001:325001(65000) ack 1 win 257
+    * > . 1:1(0) ack 325001
+   +0 < P. 325001:390001(65000) ack 1 win 257
+    * > . 1:1(0) ack 390001
+   +0 < P. 390001:455001(65000) ack 1 win 257
+    * > . 1:1(0) ack 455001
+   +0 < P. 455001:520001(65000) ack 1 win 257
+    * > . 1:1(0) ack 520001
+   +0 < P. 520001:585001(65000) ack 1 win 257
+    * > . 1:1(0) ack 585001
+   +0 < P. 585001:650001(65000) ack 1 win 257
+    * > . 1:1(0) ack 650001
+   +0 < P. 650001:715001(65000) ack 1 win 257
+    * > . 1:1(0) ack 715001
+   +0 < P. 715001:780001(65000) ack 1 win 257
+    * > . 1:1(0) ack 780001
+   +0 < P. 780001:845001(65000) ack 1 win 257
+    * > . 1:1(0) ack 845001
+   +0 < P. 845001:910001(65000) ack 1 win 257
+    * > . 1:1(0) ack 910001
+   +0 < P. 910001:975001(65000) ack 1 win 257
+    * > . 1:1(0) ack 975001
+   +0 < P. 975001:1040001(65000) ack 1 win 257
+    * > . 1:1(0) ack 1040001
+
+// Trigger an extreme memory squeeze by shrinking SO_RCVBUF
+   +0 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [16000], 4) = 0
+
+   +0 < P. 1040001:1105001(65000) ack 1 win 257
+    * > . 1:1(0) ack 1040001 win 0
+// Check LINUX_MIB_TCPRCVQDROP has been incremented
+   +0 `nstat -s | grep TcpExtTCPRcvQDrop| grep -q " 1 "`
+
+// RWIN == 0: rcv_wup = 1040001, rcv_wnd = 0, rcv_mwnd_seq > 1105001 (significantly larger, typically ~1970000)
+
+// Accept pure ack with seq in max adv. window
+   +0 write(4, ..., 1000) = 1000
+   +0 > P. 1:1001(1000) ack 1040001 win 0
+   +0 < . 1105001:1105001(0) ack 1001 win 257
+
+// In order segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW)
+   +0 < P. 1040001:1041001(1000) ack 1001 win 257
+   +0 > . 1001:1001(0) ack 1040001 win 0
+// Ooo partial segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW)
+   +0 < P. 1039001:1041001(2000) ack 1001 win 257
+   +0 > . 1001:1001(0) ack 1040001 win 0
+// Check LINUX_MIB_TCPZEROWINDOWDROP has been incremented twice
+   +0 `nstat -s | grep TcpExtTCPZeroWindowDrop| grep -q " 2 "`
+
+// Ooo segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_OVERWINDOW)
+   +0 < P. 1105001:1106001(1000) ack 1001 win 257
+   +0 > . 1001:1001(0) ack 1040001 win 0
+// Ooo segment, beyond max adv. window -> drop (SKB_DROP_REASON_TCP_INVALID_SEQUENCE)
+   +0 < P. 2000001:2001001(1000) ack 1001 win 257
+   +0 > . 1001:1001(0) ack 1040001 win 0
+// Check LINUX_MIB_BEYOND_WINDOW has been incremented twice
+   +0 `nstat -s | grep TcpExtBeyondWindow | grep -q " 2 "`
+
+// Read all data
+   +0 read(4, ..., 2000000) = 1040000
+    * > . 1001:1001(0) ack 1040001
+
+// RWIN > 0: rcv_wup = 1040001, 0 < rcv_wnd < 32000, rcv_mwnd_seq > 1105001 (significantly larger, typically ~1970000)
+
+// Accept pure ack with seq in max adv. window, beyond adv. window
+   +0 write(4, ..., 1000) = 1000
+   +0 > P. 1001:2001(1000) ack 1040001
+   +0 < . 1105001:1105001(0) ack 2001 win 257
+
+// In order segment, in max adv. window, in adv. window -> accept
+// Note: This also ensures that we cannot hit the empty queue exception in tcp_sequence() in the following tests
+   +0 < P. 1040001:1041001(1000) ack 2001 win 257
+    * > . 2001:2001(0) ack 1041001
+
+// Ooo partial segment, in adv. window -> accept
+   +0 < P. 1040001:1042001(2000) ack 2001 win 257
+   +0 > . 2001:2001(0) ack 1042001
+
+// Ooo segment, in max adv. window, beyond adv. window -> drop (SKB_DROP_REASON_TCP_OVERWINDOW)
+   +0 < P. 1105001:1106001(1000) ack 2001 win 257
+   +0 > . 2001:2001(0) ack 1042001
+// Ooo segment, beyond max adv. window, beyond adv. window -> drop (SKB_DROP_REASON_TCP_INVALID_SEQUENCE)
+   +0 < P. 2000001:2001001(1000) ack 2001 win 257
+   +0 > . 2001:2001(0) ack 1042001
+// Check LINUX_MIB_BEYOND_WINDOW has been incremented twice
+   +0 `nstat -s | grep TcpExtBeyondWindow | grep -q " 4 "`
+
+// We are allowed to go beyond the window and buffer with one packet
+   +0 < P. 1042001:1062001(20000) ack 2001 win 257
+    * > . 2001:2001(0) ack 1062001
+   +0 < P. 1062001:1082001(20000) ack 2001 win 257
+    * > . 2001:2001(0) ack 1082001 win 0
+
+// But not more: In order segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW)
+   +0 < P. 1082001:1083001(1000) ack 2001 win 257
+    * > . 2001:2001(0) ack 1082001
+// Check LINUX_MIB_TCPZEROWINDOWDROP has been incremented again
+   +0 `nstat -s | grep TcpExtTCPZeroWindowDrop| grep -q " 3 "`
-- 
2.53.0

From nobody Sun Mar 22 08:30:24 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D9E8936B074; Mon, 9 Mar 2026 08:03:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; cv=none; b=QWzOP1ei5PPCrec4JbGBq0KD1Wxj2nd3PDkqeNxL67Ev6kPWsp7CoKQ6Z0w398Ai47gAe9DULMrk4KOWIwgozAiTRwSZrGjIidMLiRGoLVo6z+6vBu3Qx0aPrAMNqLO5dU3csUjOvxrLClWQOdGfZ2hnZJAVsP5++XAk5LgEdrc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773043426; c=relaxed/simple; bh=wm3+FPgt/onbzQ/osNpSD8KFc26/gyxoBmdvkSWqouY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=hOZpGf1HktQ56tFy5seW1FUJoQachqE7wYOd83xrk1eoTJug7Kx0DHCzWlj7EMIykFzemRNDars1X/R7j6BmhflJDjc7su1IxVPKIgZcIh54bAVi0dDKyHizi/3bsFgeoDPgduUHO7BygahDex5D2oyXlNLvnvbmsjQcg5ED7IQ=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ts9DMZy9; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ts9DMZy9"
Received: by smtp.kernel.org (Postfix) with ESMTPS id 9B412C2BCB2; Mon, 9 Mar 2026 08:03:46 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773043426; bh=wm3+FPgt/onbzQ/osNpSD8KFc26/gyxoBmdvkSWqouY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=ts9DMZy9wNVrpbIARzo1Yeqcfd3twYqEa09HQ8z/6Ny8+FUTZTSZ/k740vX1Brf8L
U0NXv6Qq0gMGt7g5P39nDciJvLZUt5IoJClBfG7F3Z2RE3S/AG7WV9c1nzISIJ+LvZ O0w5lAD7TIgEIw3YjgSeyoxmociN2dGZto6qFScxN/utozOp6NcP5xyPALR8lxVBW4 CvJ/zY7jSD+vCaj+OwEueEJB3ezV/EyQw5lOhxKMdk4luDMNVn5igsAvTcb+277uAH 1Tf2aJgtW0Awinx/GXzaQ0VRLpPToF8G5n8yYwf+/Rk/sfqUiHDl0JdGmJqGr+XxYI Qy8ATBFUB5i3A== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92632EF36FB; Mon, 9 Mar 2026 08:03:46 +0000 (UTC) From: Simon Baatz via B4 Relay Date: Mon, 09 Mar 2026 09:02:30 +0100 Subject: [PATCH net-next v3 5/6] selftests/net: packetdrill: add tcp_rcv_wnd_shrink_allowed.pkt Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-5-4c7f96b1ec69@gmail.com> References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com> To: Eric Dumazet , Neal Cardwell , Kuniyuki Iwashima , "David S. 
Miller" , Jakub Kicinski , Paolo Abeni , Simon Horman , Jonathan Corbet , Shuah Khan , David Ahern , Jon Maloy , Jason Xing , mfreemon@cloudflare.com, Shuah Khan , Stefano Brivio , Matthieu Baerts , Mat Martineau , Geliang Tang Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, mptcp@lists.linux.dev, Simon Baatz X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1773043425; l=1920; i=gmbnomis@gmail.com; s=20260220; h=from:subject:message-id; bh=arhqPRPG3bbk8HOp2tIbu0W9lKz7uQEcs1MV+6YcVmE=; b=6U+j8jI9mgxnmmfYsnaaiZNjL8QRLggJ1951sCepfaA2tkHuxov6XCT4OHnCKQWsWXDnqejCy +lRkxVzNpKNCPod8a/Qokehmuhh9tjMcUe9QmnSmEI2sAX04fEQMQU8 X-Developer-Key: i=gmbnomis@gmail.com; a=ed25519; pk=T/JIz/6F5bf1uQJr69lmyi7czVG+F9TVZ/8x5z9Wtqw= X-Endpoint-Received: by B4 Relay for gmbnomis@gmail.com/20260220 with auth_id=641 X-Original-From: Simon Baatz Reply-To: gmbnomis@gmail.com From: Simon Baatz This test verifies the sequence number checks using the maximum advertised window sequence number when net.ipv4.tcp_shrink_window is enabled. Signed-off-by: Simon Baatz Reviewed-by: Eric Dumazet --- .../net/packetdrill/tcp_rcv_wnd_shrink_allowed.pkt | 40 ++++++++++++++++++= ++++ 1 file changed, 40 insertions(+) diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_all= owed.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_allow= ed.pkt new file mode 100644 index 0000000000000000000000000000000000000000..6af0e0eb183a0d2fa474c304d96= 9ce6ddeb2a1e1 --- /dev/null +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_allowed.pkt @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0 + +--mss=3D1000 + +`./defaults.sh +sysctl -q net.ipv4.tcp_shrink_window=3D1 +sysctl -q net.ipv4.tcp_rmem=3D"4096 32768 $((32*1024*1024))"` + + 0 `nstat -n` + +// Establish a connection. 
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792
+   +0 > S. 0:0(0) ack 1
+   +0 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+   +0 < P. 1:10001(10000) ack 1 win 257
+    * > . 1:1(0) ack 10001 win 15
+
+   +0 < P. 10001:11024(1023) ack 1 win 257
+    * > . 1:1(0) ack 11024 win 13
+
+// Max window seq advertised 10001 + 15*1024 = 25361, last advertised: 11024 + 13*1024 = 24336
+
+// Segment beyond the max window is dropped
+   +0 < P. 11024:25362(14338) ack 1 win 257
+    * > . 1:1(0) ack 11024 win 13
+
+// Segment using the max window is accepted
+   +0 < P. 11024:25361(14337) ack 1 win 257
+    * > . 1:1(0) ack 25361 win 0
+
+// Check LINUX_MIB_BEYOND_WINDOW has been incremented once
+   +0 `nstat | grep TcpExtBeyondWindow | grep -q " 1 "`
-- 
2.53.0

From nobody Sun Mar 22 08:30:24 2026
From: Simon Baatz via B4 Relay
Date: Mon, 09 Mar 2026 09:02:31 +0100
Subject: [PATCH net-next v3 6/6] selftests/net: packetdrill: add tcp_rcv_neg_window.pkt
Message-Id: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-6-4c7f96b1ec69@gmail.com>
References: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com>
In-Reply-To: <20260309-tcp_rfc7323_retract_wnd_rfc-v3-0-4c7f96b1ec69@gmail.com>
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, "David S. Miller",
 Jakub Kicinski, Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
 David Ahern, Jon Maloy, Jason Xing, mfreemon@cloudflare.com, Shuah Khan,
 Stefano Brivio, Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 mptcp@lists.linux.dev, Simon Baatz
X-Original-From: Simon Baatz
Reply-To: gmbnomis@gmail.com

From: Simon Baatz

The test ensures we correctly apply the maximum advertised window limit
when rcv_nxt advances past rcv_mwnd_seq, so that the "usable window"
is properly clamped to zero rather than becoming negative.

Signed-off-by: Simon Baatz
Reviewed-by: Eric Dumazet
---
 .../net/packetdrill/tcp_rcv_neg_window.pkt         | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..15a9b4938f16d175ac54f3fd192ed2b59b0a4399
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+
+--mss=1000
+
+`./defaults.sh`
+
+// Establish a connection.
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [20000], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792
+   +0 > S. 0:0(0) ack 1 win 18980
+  +.1 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+// A too big packet is accepted if the receive queue is empty
+   +0 < P. 1:20001(20000) ack 1 win 257
+// Send a RST immediately so that there is no rcv_wup/rcv_mwnd_seq update yet
+   +0 < R. 20001:20001(0) ack 1 win 257
+
+  +.1 %{ assert tcpi_state == TCP_CLOSE, tcpi_state }%
+
-- 
2.53.0
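
For readers following the numbers in tcp_rcv_wnd_shrink_allowed.pkt and
tcp_rcv_neg_window.pkt, the window arithmetic can be modelled in a few lines
of Python. This is only an illustrative sketch, not the kernel code: the
function names are hypothetical, a window scale of 10 is assumed because the
test comment multiplies by 1024, and the edge value 18981 assumes the maximum
advertised edge is the SYN-ACK's "ack 1" plus "win 18980".

```python
# Toy model of "maximum advertised window" arithmetic (hypothetical names,
# not the kernel implementation). Sequence numbers are 32-bit and wrap.

MASK32 = 0xFFFFFFFF


def seq_diff(a: int, b: int) -> int:
    """Signed 32-bit difference a - b, so comparisons survive wraparound."""
    d = (a - b) & MASK32
    return d - (1 << 32) if d & 0x80000000 else d


def window_right_edge(ack: int, win: int, wscale: int) -> int:
    """Right edge advertised by an ACK: ack + (win << wscale), modulo 2^32."""
    return (ack + (win << wscale)) & MASK32


def max_edge(current: int, candidate: int) -> int:
    """Keep whichever right edge lies further ahead in sequence space."""
    return candidate if seq_diff(candidate, current) > 0 else current


def usable_max_window(rcv_nxt: int, max_wnd_seq: int) -> int:
    """Distance from rcv_nxt to the maximum edge, clamped at zero."""
    return max(seq_diff(max_wnd_seq, rcv_nxt), 0)


def fits_max_window(end_seq: int, max_wnd_seq: int) -> bool:
    """True if a segment ends at or before the maximum advertised edge."""
    return seq_diff(end_seq, max_wnd_seq) <= 0


if __name__ == "__main__":
    # Numbers from tcp_rcv_wnd_shrink_allowed.pkt (wscale 10 is an assumption).
    first = window_right_edge(10001, 15, 10)   # 25361
    second = window_right_edge(11024, 13, 10)  # 24336
    edge = max_edge(first, second)             # stays at 25361
    print(edge, fits_max_window(25362, edge), fits_max_window(25361, edge))

    # Numbers from tcp_rcv_neg_window.pkt: rcv_nxt 20001 is past edge 18981.
    print(usable_max_window(20001, 18981))
```

The signed difference is what keeps the comparison meaningful when sequence
numbers wrap, and the clamp in usable_max_window() mirrors the "clamped to
zero rather than becoming negative" behaviour described in patch 6/6.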