[PATCH net-next] tls: check return value of strp_load_anchor_with_queue

Geliang Tang posted 1 patch 2 weeks, 1 day ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/ce74452f4c095a1761ef493b767b4bd9f9c14359.1764333805.git.tanggeliang@kylinos.cn
net/tls/tls_strp.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
[PATCH net-next] tls: check return value of strp_load_anchor_with_queue
Posted by Geliang Tang 2 weeks, 1 day ago
From: Geliang Tang <tanggeliang@kylinos.cn>

In tls_strp_load_anchor_with_queue(), when first is null, strp->anchor is
not successfully initialized. Accessing strp->anchor afterward will result
in a memory access error (for example, BUG: KASAN: slab-use-after-free in
skb_copy_bits).

This patch adds a bool return value to tls_strp_load_anchor_with_queue()
to indicate whether initialization was successful. It also adds checks for
this return value in both tls_strp_msg_load() and tls_strp_read_sock().
If the initialization fails, the functions will return immediately,
preventing invalid memory access.

Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
I encountered the following error while developing MPTCP TLS support [1][2]:

'''
[ 2054.086130][T15666] ==================================================================
[ 2054.086450][T15666] BUG: KASAN: slab-use-after-free in skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086763][T15666] Read of size 4 at addr ffff8881224c9550 by task test_progs-cpuv/15666
[ 2054.086871][T15666] 
[ 2054.086912][T15666] CPU: 18 UID: 0 PID: 15666 Comm: test_progs-cpuv Tainted: G           OE       6.18.0-rc6+ #36 PREEMPT(full) 
[ 2054.086915][T15666] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 2054.086916][T15666] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 2054.086917][T15666] Call Trace:
[ 2054.086919][T15666]  <TASK>
[ 2054.086920][T15666]  ? skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086922][T15666]  dump_stack_lvl (lib/dump_stack.c:122)
[ 2054.086927][T15666]  print_address_description.constprop.0 (mm/kasan/report.c:379)
[ 2054.086931][T15666]  ? skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086932][T15666]  print_report (mm/kasan/report.c:483)
[ 2054.086933][T15666]  ? __virt_addr_valid (include/linux/rcupdate.h:981 (discriminator 3) include/linux/mmzone.h:2197 (discriminator 3) arch/x86/mm/physaddr.c:65 (discriminator 3))
[ 2054.086937][T15666]  ? skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086939][T15666]  kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:597)
[ 2054.086943][T15666]  ? skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086946][T15666]  skb_copy_bits (net/core/skbuff.c:3039)
[ 2054.086948][T15666]  ? __lock_release.isra.0 (kernel/locking/lockdep.c:5536)
[ 2054.086950][T15666]  ? trace_lock_acquire (include/trace/events/lock.h:24 (discriminator 33))
[ 2054.086953][T15666]  ? mark_held_locks (kernel/locking/lockdep.c:4325 (discriminator 1))
[ 2054.086955][T15666]  tls_rx_msg_size (net/tls/tls_sw.c:2464)
[ 2054.086960][T15666]  ? tls_strp_load_anchor_with_queue (net/tls/tls_strp.c:472 (discriminator 1))
[ 2054.086962][T15666]  ? __pfx_tls_rx_msg_size (net/tls/tls_sw.c:2444)
[ 2054.086965][T15666]  tls_strp_check_rcv (net/tls/tls_strp.c:538 net/tls/tls_strp.c:562)
[ 2054.086967][T15666]  ? __pfx_mptcp_read_done (net/mptcp/protocol.c:4486)
[ 2054.086971][T15666]  ? __pfx_tls_strp_check_rcv (net/tls/tls_strp.c:558)
[ 2054.086972][T15666]  ? __pfx_tls_rx_one_record (net/tls/tls_sw.c:1803)
[ 2054.086975][T15666]  ? __asan_memset (mm/kasan/shadow.c:84 (discriminator 2))
[ 2054.086977][T15666]  ? tls_strp_msg_done (net/tls/tls_strp.c:608)
[ 2054.086979][T15666]  tls_sw_recvmsg (net/tls/tls_sw.c:2160)
[ 2054.086982][T15666]  ? __lock_acquire (kernel/locking/lockdep.c:5237)
[ 2054.086985][T15666]  ? __pfx_tls_sw_recvmsg (net/tls/tls_sw.c:2038)
[ 2054.086988][T15666]  ? lock_acquire.part.0 (kernel/locking/lockdep.c:470 kernel/locking/lockdep.c:5870)
[ 2054.086989][T15666]  ? find_held_lock (kernel/locking/lockdep.c:5350 (discriminator 1))
[ 2054.086992][T15666]  inet_recvmsg (net/ipv4/af_inet.c:891 (discriminator 7))
[ 2054.086994][T15666]  ? __fget_files (include/linux/rcupdate.h:341 (discriminator 1) include/linux/rcupdate.h:897 (discriminator 1) fs/file.c:1072 (discriminator 1))
[ 2054.086997][T15666]  ? __pfx_inet_recvmsg (net/ipv4/af_inet.c:883)
[ 2054.087000][T15666]  ? security_socket_recvmsg (security/security.c:4774 (discriminator 15))
[ 2054.087003][T15666]  sock_recvmsg (net/socket.c:1078 (discriminator 15) net/socket.c:1100 (discriminator 15))
[ 2054.087006][T15666]  __sys_recvfrom (net/socket.c:2296)
[ 2054.087008][T15666]  ? __pfx___sys_recvfrom (net/socket.c:2271)
[ 2054.087012][T15666]  ? trace_rseq_update (include/trace/events/rseq.h:11 (discriminator 33))
[ 2054.087016][T15666]  ? xfd_validate_state (arch/x86/kernel/fpu/xstate.c:1499 arch/x86/kernel/fpu/xstate.c:1543)
[ 2054.087019][T15666]  __x64_sys_recvfrom (net/socket.c:2309 (discriminator 1) net/socket.c:2305 (discriminator 1) net/socket.c:2305 (discriminator 1))
[ 2054.087021][T15666]  ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4473)
[ 2054.087024][T15666]  ? do_syscall_64 (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:119 include/linux/entry-common.h:124 arch/x86/entry/syscall_64.c:90)
[ 2054.087027][T15666]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[ 2054.087029][T15666]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 2054.087031][T15666] RIP: 0033:0x7f309edf1772
'''
This patch can fix it.

[1]
https://github.com/multipath-tcp/mptcp_net-next/issues/480
[2]
https://patchwork.kernel.org/project/mptcp/cover/cover.1763800601.git.tanggeliang@kylinos.cn/
---
 net/tls/tls_strp.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c
index 98e12f0ff57e..48bd64f103ec 100644
--- a/net/tls/tls_strp.c
+++ b/net/tls/tls_strp.c
@@ -458,7 +458,7 @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 	return true;
 }
 
-static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
+static bool tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 {
 	struct tcp_sock *tp = tcp_sk(strp->sk);
 	struct sk_buff *first;
@@ -466,7 +466,7 @@ static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 
 	first = tcp_recv_skb(strp->sk, tp->copied_seq, &offset);
 	if (WARN_ON_ONCE(!first))
-		return;
+		return false;
 
 	/* Bestow the state onto the anchor */
 	strp->anchor->len = offset + len;
@@ -479,6 +479,7 @@ static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 	strp->anchor->destructor = NULL;
 
 	strp->stm.offset = offset;
+	return true;
 }
 
 bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
@@ -496,7 +497,8 @@ bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 			return false;
 		}
 
-		tls_strp_load_anchor_with_queue(strp, strp->stm.full_len);
+		if (!tls_strp_load_anchor_with_queue(strp, strp->stm.full_len))
+			return false;
 	}
 
 	rxm = strp_msg(strp->anchor);
@@ -523,7 +525,8 @@ static int tls_strp_read_sock(struct tls_strparser *strp)
 	if (inq < strp->stm.full_len)
 		return tls_strp_read_copy(strp, true);
 
-	tls_strp_load_anchor_with_queue(strp, inq);
+	if (!tls_strp_load_anchor_with_queue(strp, inq))
+		return 0;
 	if (!strp->stm.full_len) {
 		sz = tls_rx_msg_size(strp, strp->anchor);
 		if (sz < 0)
-- 
2.51.0
Re: [PATCH net-next] tls: check return value of strp_load_anchor_with_queue
Posted by Paolo Abeni 2 weeks, 1 day ago
On 11/28/25 1:55 PM, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> In tls_strp_load_anchor_with_queue(), when first is null, strp->anchor is
> not successfully initialized. Accessing strp->anchor afterward will result
> in a memory access error (for example, BUG: KASAN: slab-use-after-free in
> skb_copy_bits).

tls_strp_load_anchor_with_queue() has:

	WARN_ON_ONCE(!first)

and AFAICS all the tls_strp_load_anchor_with_queue() call sites ensure
that the receive queue is not empty before invoking such function.

Hitting the above condition is a symtom of a prior issue that must be
identified and fixed. Please try to solve such problem instead.

/P
Re: [PATCH net-next] tls: check return value of strp_load_anchor_with_queue
Posted by Geliang Tang 1 week, 4 days ago
Hi Paolo,

mptcp-list only.

On Fri, 2025-11-28 at 16:01 +0100, Paolo Abeni wrote:
> On 11/28/25 1:55 PM, Geliang Tang wrote:
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> > 
> > In tls_strp_load_anchor_with_queue(), when first is null, strp-
> > >anchor is
> > not successfully initialized. Accessing strp->anchor afterward will
> > result
> > in a memory access error (for example, BUG: KASAN: slab-use-after-
> > free in
> > skb_copy_bits).
> 
> tls_strp_load_anchor_with_queue() has:
> 
> 	WARN_ON_ONCE(!first)

I printed sk_state of strp->sk when first was null, and found it to be
8, which corresponds to TCP_CLOSE_WAIT.

Following further debugging, it was found that the issue lies in
tls_strp_read_sock() in [1] ("tls: add MPTCP protocol support"), where
I used mptcp_inq_hint() to replace tcp_inq() as follows:

-	inq = tcp_inq(strp->sk);
+	inq = strp->sk->sk_protocol == IPPROTO_MPTCP ?
+	      mptcp_inq_hint(strp->sk) :
+	      tcp_inq(strp->sk);
 	if (inq < 1)
 		return 0;

However, mptcp_inq_hint() returns 1 when sk->sk_state == TCP_CLOSE or
sk->sk_shutdown & RCV_SHUTDOWN, which is inconsistent with the behavior
of tcp_inq().

I can modify the condition here from 'if (inq < 1)' to 'if (inq <= 1)'
to fix the issue, but I know this isn't a good solution. I need to
reimplement an mptcp_inq() helper instead of directly using
mptcp_inq_hint(), right? Could you give me some advice?

Thanks,
-Geliang

[1]
https://patchwork.kernel.org/project/mptcp/patch/2a0f438fbd4c5ddf7d8153bfc8aa44cfefa58c45.1763800601.git.tanggeliang@kylinos.cn/

> 
> and AFAICS all the tls_strp_load_anchor_with_queue() call sites
> ensure
> that the receive queue is not empty before invoking such function.
> 
> Hitting the above condition is a symtom of a prior issue that must be
> identified and fixed. Please try to solve such problem instead.




> 
> /P
Re: [PATCH net-next] tls: check return value of strp_load_anchor_with_queue
Posted by MPTCP CI 2 weeks, 1 day ago
Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Unstable: 1 failed test(s): bpftest_test_progs-cpuv4_mptcp 🔴
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19764761036

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/2ddcd4f5ee35
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1028659


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)