From: Ankit Jain
To: netdev@vger.kernel.org, davem@davemloft.net, dsahern@kernel.org,
	edumazet@google.com, ncardwell@google.com, kuniyu@google.com,
	kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
	quic_stranche@quicinc.com, quic_subashab@quicinc.com
Cc: linux-kernel@vger.kernel.org, karen.badiryan@broadcom.com,
	ajay.kaher@broadcom.com, alexey.makhalov@broadcom.com,
	vamsi-krishna.brahmajosyula@broadcom.com, yin.ding@broadcom.com,
	tapas.kundu@broadcom.com, Ankit Jain, stable@vger.kernel.org
Subject: [PATCH net] tcp: do not shrink window clamp when SO_RCVBUF is locked
Date: Mon, 27 Apr 2026 15:27:55 +0000
Message-ID: <20260427152756.1205-1-ankit-aj.jain@broadcom.com>
X-Mailer: git-send-email 2.53.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When an application explicitly sets SO_RCVBUF, the window clamp should
not be dynamically recalculated based on the memory scaling_ratio.

Currently, tcp_measure_rcv_mss() aggressively crushes the window clamp
down when it sees a poor skb->len to skb->truesize ratio. If the
application explicitly locked the buffer via SO_RCVBUF, this
recalculation causes the advertised window to drop severely. If the
window drops below the interface MSS, it triggers Silly Window Syndrome
(SWS) avoidance on the sender. The sender defers transmission and drops
the connection into a perpetual 200ms PROBE0 timer loop, drastically
reducing throughput.

This is highly reproducible on loopback interfaces (MTU 65536) using
Java-based workloads (like Tomcat/GemFire) where the JVM sets SO_RCVBUF
to 32K or 64K. The bloated loopback truesize forces the scaling ratio
to drop, crushing the window clamp to ~26K, instantly triggering SWS
stalls and causing gigabyte transfers to take minutes instead of
milliseconds.

Since the application locked the buffer, the kernel should respect the
clamp boundary and not dynamically crush it based on runtime ratios.

Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Cc: stable@vger.kernel.org
Reported-by: Karen Badiryan
Signed-off-by: Ankit Jain
---
Note to reviewers:

Testing Context:
- The SWS deadlock was successfully reproduced on the latest netdev/net
  tree (v7.1-rc1) using the actual enterprise Java workload.
- Applying this patch completely resolves the 504 Timeouts and restores
  loopback throughput.
- Baseline iperf3 auto-tuning remains unaffected by this patch.

For context, here is the exact sequence of events that triggers the
recalculation flaw, illustrated in a packetdrill-style flow.
Unpatched kernels aggressively crush the window at step 3, triggering SWS.

// 1. Tomcat creates socket and hardcodes the buffer to 32K
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [32768]) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// 2. GemFire connects over loopback (simulating Jumbo MSS of 65496)
+0 < S 0:0(0) win 65535
+0 > S. 0:0(0) ack 1 <...>
+0 < . 1:1(0) ack 1 win 65535
+0 accept(3, ..., ...) = 4

// 3. GemFire sends a 20KB packet, dropping the scaling_ratio.
//    Without the patch, tcp_measure_rcv_mss() crushes the window_clamp here.
+0.1 < . 1:20001(20000) ack 1 win 65535
+0.1 read(4, ..., 20000) = 20000

// 4. Assert window did not crush
//    WITH the patch, the kernel respects the SOCK_RCVBUF_LOCK.
+0 > . 1:1(0) ack 20001 win 65535

---
 net/ipv4/tcp_input.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d5c9e65d9..c1cb9d3ed 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -248,7 +248,8 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
 			do_div(val, skb->truesize);
 			tcp_sk(sk)->scaling_ratio = val ? val : 1;
 
-			if (old_ratio != tcp_sk(sk)->scaling_ratio) {
+			if (old_ratio != tcp_sk(sk)->scaling_ratio &&
+			    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
 				struct tcp_sock *tp = tcp_sk(sk);
 
 				val = tcp_win_from_space(sk, sk->sk_rcvbuf);
-- 
2.53.0
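
For readers who want to check the arithmetic behind the clamp
recalculation, the following is a minimal userspace sketch, not kernel
code. It assumes the 8-bit fixed-point ratio visible in the hunk above
(payload length over truesize, with a floor of 1) and models the
patched condition with plain integers. The helper names
scaling_ratio(), win_from_space() and update_clamp(), the RCVBUF_LOCK
stand-in, and all sample values are hypothetical and chosen only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Shift used for the fixed-point ratio (ratio = len * 256 / truesize). */
#define RMEM_TO_WIN_SCALE 8

/* Hypothetical userspace stand-in for the SOCK_RCVBUF_LOCK bit. */
#define RCVBUF_LOCK 2u

/* Payload-to-truesize ratio as an 8-bit fraction, never below 1. */
static uint8_t scaling_ratio(uint32_t len, uint32_t truesize)
{
	uint64_t val;

	assert(truesize != 0);
	val = ((uint64_t)len << RMEM_TO_WIN_SCALE) / truesize;
	if (val > UINT8_MAX)	/* sketch-only guard; truesize normally exceeds len */
		val = UINT8_MAX;
	return val ? (uint8_t)val : 1;
}

/* Window that a given amount of buffer space can back at this ratio. */
static uint32_t win_from_space(uint8_t ratio, uint32_t space)
{
	return (uint32_t)(((uint64_t)space * ratio) >> RMEM_TO_WIN_SCALE);
}

/*
 * Simplified model of the patched branch: recompute the clamp only when
 * the ratio changed AND the application did not lock the receive buffer.
 * Otherwise the existing clamp is returned untouched.
 */
static uint32_t update_clamp(uint32_t clamp, uint8_t old_ratio,
			     uint8_t new_ratio, uint32_t rcvbuf,
			     unsigned int userlocks)
{
	if (old_ratio != new_ratio && !(userlocks & RCVBUF_LOCK))
		return win_from_space(new_ratio, rcvbuf);
	return clamp;
}
```

As a worked case under these assumptions: a 20000-byte payload carried
in a buffer with a hypothetical truesize of 50000 gives
scaling_ratio(20000, 50000) == 102. Linux doubles the value passed to
SO_RCVBUF, so a 32768-byte request yields a 65536-byte sk_rcvbuf, and
win_from_space(102, 65536) == 26112. With the lock bit clear,
update_clamp() returns that 26112-byte clamp; with RCVBUF_LOCK set it
returns the previous clamp unchanged.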