From: Jiayuan Chen
To: netdev@vger.kernel.org
Cc: jiayuan.chen@linux.dev,
	Jiayuan Chen,
	syzbot+ca1345cca66556f3d79b@syzkaller.appspotmail.com,
	John Fastabend,
	Jakub Kicinski,
	Sabrina Dubroca,
	"David S. Miller",
	Eric Dumazet,
	Paolo Abeni,
	Simon Horman,
	Vakul Garg,
	linux-kernel@vger.kernel.org
Subject: [PATCH net v1] tls: fix hung task in tx_work_handler by using non-blocking sends
Date: Fri, 27 Feb 2026 14:32:31 +0800
Message-ID: <20260227063231.168520-1-jiayuan.chen@linux.dev>

From: Jiayuan Chen

tx_work_handler calls tls_tx_records with flags=-1, which preserves
each record's original tx_flags but results in tcp_sendmsg_locked
using an infinite send timeout. When the peer is unresponsive and the
send buffer is full, tcp_sendmsg_locked blocks indefinitely in
sk_stream_wait_memory. This causes tls_sk_proto_close to hang in
cancel_delayed_work_sync waiting for tx_work_handler to finish,
leading to a hung task:

  INFO: task ...: blocked for more than ... seconds.
  Call Trace:
    cancel_delayed_work_sync
    tls_sw_cancel_work_tx
    tls_sk_proto_close

A workqueue handler should never block indefinitely. Fix this by
introducing __tls_tx_records() with an extra_flags parameter that
gets OR'd into each record's tx_flags.
tx_work_handler uses this to pass MSG_DONTWAIT so tcp_sendmsg_locked
returns -EAGAIN immediately when the send buffer is full, without
overwriting the original per-record flags (MSG_MORE, MSG_NOSIGNAL,
etc.). On -EAGAIN, the existing reschedule mechanism retries after a
short delay.

Also consolidate the two identical reschedule paths (lock contention
and -EAGAIN) into one.

Reported-by: syzbot+ca1345cca66556f3d79b@syzkaller.appspotmail.com
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Jiayuan Chen
---
 net/tls/tls_sw.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9937d4c810f2..c9d3d44581da 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -404,7 +404,7 @@ static void tls_free_open_rec(struct sock *sk)
 	}
 }
 
-int tls_tx_records(struct sock *sk, int flags)
+static int __tls_tx_records(struct sock *sk, int flags, int extra_flags)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
@@ -417,9 +417,9 @@ int tls_tx_records(struct sock *sk, int flags)
 				       struct tls_rec, list);
 
 		if (flags == -1)
-			tx_flags = rec->tx_flags;
+			tx_flags = rec->tx_flags | extra_flags;
 		else
-			tx_flags = flags;
+			tx_flags = flags | extra_flags;
 
 		rc = tls_push_partial_record(sk, tls_ctx, tx_flags);
 		if (rc)
@@ -463,6 +463,11 @@ int tls_tx_records(struct sock *sk, int flags)
 	return rc;
 }
 
+int tls_tx_records(struct sock *sk, int flags)
+{
+	return __tls_tx_records(sk, flags, 0);
+}
+
 static void tls_encrypt_done(void *data, int err)
 {
 	struct tls_sw_context_tx *ctx;
@@ -2629,6 +2634,7 @@ static void tx_work_handler(struct work_struct *work)
 	struct sock *sk = tx_work->sk;
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_tx *ctx;
+	int rc;
 
 	if (unlikely(!tls_ctx))
 		return;
@@ -2642,16 +2648,21 @@ static void tx_work_handler(struct work_struct *work)
 
 	if (mutex_trylock(&tls_ctx->tx_lock)) {
 		lock_sock(sk);
-		tls_tx_records(sk, -1);
+		rc = __tls_tx_records(sk, -1, MSG_DONTWAIT);
 		release_sock(sk);
 		mutex_unlock(&tls_ctx->tx_lock);
-	} else if (!test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
-		/* Someone is holding the tx_lock, they will likely run Tx
-		 * and cancel the work on their way out of the lock section.
-		 * Schedule a long delay just in case.
-		 */
-		schedule_delayed_work(&ctx->tx_work.work, msecs_to_jiffies(10));
+		if (rc != -EAGAIN)
+			return;
 	}
+
+	/* Someone is holding the tx_lock, they will likely run Tx
+	 * and cancel the work on their way out of the lock section.
+	 * Schedule a long delay just in case.
+	 * Also reschedule on -EAGAIN when the send buffer is full
+	 * to avoid blocking the workqueue indefinitely.
+	 */
+	if (!test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
+		schedule_delayed_work(&ctx->tx_work.work, msecs_to_jiffies(10));
 }
 
 static bool tls_is_tx_ready(struct tls_sw_context_tx *ctx)
-- 
2.43.0
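
For readers who want to check the flag handling and the consolidated
reschedule path outside the kernel, the following is a minimal userspace
sketch, not kernel code. The helper names effective_tx_flags() and
should_reschedule() are hypothetical and do not appear in the patch; the
"scheduled" boolean only stands in for BIT_TX_SCHEDULED, and the sketch
assumes the flag values from <sys/socket.h> on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical model of the flag selection in __tls_tx_records():
 * flags == -1 means "use the record's own tx_flags"; any other value
 * overrides them. extra_flags is OR'd in on both paths, so a caller
 * can add MSG_DONTWAIT without losing per-record flags.
 */
int effective_tx_flags(int flags, int rec_tx_flags, int extra_flags)
{
	/* extra_flags is a bitmask and must never be the -1 sentinel. */
	assert(extra_flags >= 0);

	if (flags == -1)
		return rec_tx_flags | extra_flags;
	return flags | extra_flags;
}

/* Hypothetical model of the decision at the end of tx_work_handler()
 * after the patch. The work is rescheduled when either the tx_lock
 * was contended (got_lock == false) or the non-blocking send returned
 * -EAGAIN, and only if no work is already marked as scheduled.
 * *scheduled models test_and_set_bit(BIT_TX_SCHEDULED, ...).
 */
bool should_reschedule(bool got_lock, int rc, bool *scheduled)
{
	/* Lock taken and send did not hit a full buffer: nothing to retry. */
	if (got_lock && rc != -EAGAIN)
		return false;

	/* Already scheduled by someone else: do not queue twice. */
	if (*scheduled)
		return false;

	*scheduled = true;
	return true;
}
```

With these helpers, effective_tx_flags(-1, MSG_MORE | MSG_NOSIGNAL,
MSG_DONTWAIT) keeps both per-record flags and adds MSG_DONTWAIT, while
passing extra_flags as 0 reproduces the behaviour of the unchanged
tls_tx_records() wrapper.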