From: Maoyi Xie
To: netdev@vger.kernel.org
Cc: Allison Henderson, "David S . Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Simon Horman, Praveen Kakkolangara,
    rds-devel@oss.oracle.com, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH net v6] rds: filter RDS_INFO_* getsockopt by caller's netns
Date: Wed, 20 May 2026 16:42:36 +0800
Message-Id: <20260520084236.2724349-1-maoyixie.tju@gmail.com>

The RDS_INFO_* family of getsockopt(2) options reads several file-scope
global lists that are not per-netns:

  rds_sock_info / rds6_sock_info,
  rds_sock_inc_info / rds6_sock_inc_info     -> rds_sock_list
  rds_tcp_tc_info / rds6_tcp_tc_info         -> rds_tcp_tc_list
  rds_conn_info / rds6_conn_info,
  rds_conn_message_info_cmn (for the *_SEND_MESSAGES and
  *_RETRANS_MESSAGES variants),
  rds_for_each_conn_info (for RDS_INFO_IB_CONNECTIONS)
                                             -> rds_conn_hash[]

The handlers do not filter by the caller's network namespace.
rds_info_getsockopt() has no netns or capable() check, and rds_create()
has no capable() check, so AF_RDS is reachable from an unprivileged
user namespace.

As a result, an unprivileged caller in a fresh user_ns plus netns can
read the bound address and sock inode of every RDS socket on the host,
the peer address of incoming messages on every RDS socket on the host,
the peer address and TCP sequence numbers of every rds-tcp connection
on the host, and the peer address and RDS sequence numbers of every RDS
connection on the host.

The rds-tcp transport is reachable from a non-initial netns (see
rds_set_transport()), so a one-shot init_net gate at
rds_info_getsockopt() would deny legitimate per-netns visibility to
rds-tcp callers. Instead, filter at each handler by comparing the netns
of the caller's socket to the netns of the list entry, or to
rds_conn_net(conn) for connection paths. Only copy entries whose netns
matches the caller.

Counters (RDS_INFO_COUNTERS) are aggregate statistics and remain
global.

Reproducer (KASAN VM, rds and rds_tcp loaded): an AF_RDS socket binds
127.0.0.1:4242 in init_net as root. A child process enters a fresh
user_ns plus netns and opens AF_RDS there, then calls
getsockopt(SOL_RDS, RDS_INFO_SOCKETS). Before this change, the child
sees the init_net socket. After this change, the child sees zero
entries.

Drop the rds_sock_count, rds_tcp_tc_count, and rds6_tcp_tc_count
globals. v2 used them for the size precheck and lens->nr; v3 replaced
the precheck with a per-ns count from a first pass over the list, so
the globals have no remaining readers. The matching increments and
decrements in rds_create()/rds_destroy_sock() and
rds_tcp_set_callbacks()/rds_tcp_restore_callbacks() go away with them.
Reported by the kernel test robot under clang W=1.

Suggested-by: Allison Henderson
Suggested-by: Simon Horman
Reviewed-by: Allison Henderson
Co-developed-by: Praveen Kakkolangara
Signed-off-by: Praveen Kakkolangara
Signed-off-by: Maoyi Xie
---
v6: Rebased on current net main tip (dc416e32baae) after Jakub pw-bot:cr
on v5.
The patch failed to apply at net/rds/tcp.c:201 because the separately
submitted "rds_tcp: close NULL deref window in rds_tcp_set_callbacks"
patch landed in net main between v5 and v6. That commit reorders
tc->t_sock = sock inside the rds_tcp_tc_list_lock and adds a comment
block. The 3-way merge preserves both changes. No functional change in
the netns filter behaviour relative to v5.

v5: Address Simon Horman's review of v4. rds_bind() writes
rs_bound_addr without holding rds_sock_lock (bind.c:123, 138, 160).
rds_sock_info() holds rds_sock_lock across the two passes, but a
concurrent rds_bind() can still change rs_bound_addr between them. The
second pass can then match an entry the first pass did not count. For a
len=0 probe with cnt=0, this reaches iter->pages while the buffer is
still empty. Cap the second pass at the cnt from the first pass
(if (copied >= cnt) break). The cap goes in all four handlers:
rds_sock_info, rds6_sock_info, rds_tcp_tc_info, and rds6_tcp_tc_info.
The reverse case is also handled. If pass2 copies fewer entries than
cnt because of a concurrent rds_remove_bound(), set cnt = copied before
the out: label. lens->nr then reports what was actually written.

v4: Drop the rds_sock_count, rds_tcp_tc_count, and rds6_tcp_tc_count
globals reported by the kernel test robot under clang W=1
(-Wunused-but-set-global). They have no remaining readers after v3
replaced the size precheck with a per-ns count from the first pass.
Inc/dec sites in rds_create()/rds_destroy_sock() and
rds_tcp_set_callbacks()/rds_tcp_restore_callbacks() are removed with
them. No functional change beyond v3 for the netns-filter behaviour.
Note for applier: this patch touches rds_tcp_set_callbacks() in the
same hunk window as a separate [PATCH net] sent earlier (Message-Id
<20260512142807.1855619-1-maoyi.xie@ntu.edu.sg>, "rds_tcp: close NULL
deref window in rds_tcp_set_callbacks"). Both apply cleanly to net/main
tip b266bacba in isolation.
When applied together the second one needs a trivial 3-way merge in
net/rds/tcp.c.

v3: Address Simon Horman's review of v2. The size precheck and the lens
count are now both restricted to the caller's netns in rds_sock_info,
rds6_sock_info, rds_tcp_tc_info and rds6_tcp_tc_info. Each handler now
does a first pass under the list lock to count entries visible in the
caller's netns, then short-circuits with that count if the user buffer
is too small, then a second pass to fill data. This closes both issues
Simon flagged: a zero-length probe no longer returns the global count,
and a caller that sizes its buffer to the value returned by lens no
longer hits ENOSPC on the second call. Re-verified on KASAN VM with the
v1 PoC: attacker in fresh user_ns + netns sees zero RDS_INFO_SOCKETS
entries; init_net access sees its own entries; lens returns the
ns-scoped count on both probe and full reads.

v2: rebased onto net/main tip (b266bacba) so patchwork can apply. No
code changes. Carries forward Reviewed-by from v1 review.

v1: https://lore.kernel.org/r/20260506075031.2238596-1-maoyixie.tju@gmail.com

 net/rds/af_rds.c     | 59 ++++++++++++++++++++++++++++++++++++++++-------
 net/rds/connection.c | 13 +++++++++++
 net/rds/tcp.c        | 63 ++++++++++++++++++++++++++++++++++-----------------
 3 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 76f625986a7f2..bd6271bab32cc 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -43,7 +43,6 @@
 
 /* this is just used for stats gathering :/ */
 static DEFINE_SPINLOCK(rds_sock_lock);
-static unsigned long rds_sock_count;
 static LIST_HEAD(rds_sock_list);
 DECLARE_WAIT_QUEUE_HEAD(rds_poll_waitq);
 
@@ -82,7 +81,6 @@ static int rds_release(struct socket *sock)
 
 	spin_lock_bh(&rds_sock_lock);
 	list_del_init(&rs->rs_item);
-	rds_sock_count--;
 	spin_unlock_bh(&rds_sock_lock);
 
 	rds_trans_put(rs->rs_transport);
@@ -694,7 +692,6 @@ static int __rds_create(struct socket *sock, struct sock *sk, int protocol)
 
 	spin_lock_bh(&rds_sock_lock);
 	list_add_tail(&rs->rs_item, &rds_sock_list);
-	rds_sock_count++;
 	spin_unlock_bh(&rds_sock_lock);
 
 	return 0;
@@ -735,6 +732,7 @@ static void rds_sock_inc_info(struct socket *sock, unsigned int len,
 			      struct rds_info_iterator *iter,
 			      struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_sock *rs;
 	struct rds_incoming *inc;
 	unsigned int total = 0;
@@ -744,6 +742,9 @@ static void rds_sock_inc_info(struct socket *sock, unsigned int len,
 	spin_lock_bh(&rds_sock_lock);
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		/* This option only supports IPv4 sockets. */
 		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
 			continue;
@@ -774,6 +775,7 @@ static void rds6_sock_inc_info(struct socket *sock, unsigned int len,
 			       struct rds_info_iterator *iter,
 			       struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_incoming *inc;
 	unsigned int total = 0;
 	struct rds_sock *rs;
@@ -783,6 +785,9 @@ static void rds6_sock_inc_info(struct socket *sock, unsigned int len,
 	spin_lock_bh(&rds_sock_lock);
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		read_lock(&rs->rs_recv_lock);
 
 		list_for_each_entry(inc, &rs->rs_recv_queue, i_item) {
@@ -806,20 +811,34 @@ static void rds_sock_info(struct socket *sock, unsigned int len,
 			  struct rds_info_iterator *iter,
 			  struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_info_socket sinfo;
 	unsigned int cnt = 0;
+	unsigned int copied = 0;
 	struct rds_sock *rs;
 
 	len /= sizeof(struct rds_info_socket);
 
 	spin_lock_bh(&rds_sock_lock);
 
-	if (len < rds_sock_count) {
-		cnt = rds_sock_count;
-		goto out;
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
+		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
+			continue;
+		cnt++;
 	}
 
+	if (len < cnt)
+		goto out;
+
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (copied >= cnt)
+			break;
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		/* This option only supports IPv4 sockets. */
 		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
 			continue;
@@ -832,8 +851,13 @@ static void rds_sock_info(struct socket *sock, unsigned int len,
 		sinfo.inum = sock_i_ino(rds_rs_to_sk(rs));
 
 		rds_info_copy(iter, &sinfo, sizeof(sinfo));
-		cnt++;
+		copied++;
 	}
+	/* A concurrent rds_bind() can change rs_bound_addr between the
+	 * two passes without holding rds_sock_lock, so copied may be
+	 * less than cnt. Report what was actually copied.
+	 */
+	cnt = copied;
 
 out:
 	lens->nr = cnt;
@@ -847,17 +871,32 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
 			   struct rds_info_iterator *iter,
 			   struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds6_info_socket sinfo6;
+	unsigned int cnt = 0;
+	unsigned int copied = 0;
 	struct rds_sock *rs;
 
 	len /= sizeof(struct rds6_info_socket);
 
 	spin_lock_bh(&rds_sock_lock);
 
-	if (len < rds_sock_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
+		cnt++;
+	}
+
+	if (len < cnt)
 		goto out;
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (copied >= cnt)
+			break;
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		sinfo6.sndbuf = rds_sk_sndbuf(rs);
 		sinfo6.rcvbuf = rds_sk_rcvbuf(rs);
 		sinfo6.bound_addr = rs->rs_bound_addr;
@@ -867,10 +906,12 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
 		sinfo6.inum = sock_i_ino(rds_rs_to_sk(rs));
 
 		rds_info_copy(iter, &sinfo6, sizeof(sinfo6));
+		copied++;
 	}
+	cnt = copied;
 
 out:
-	lens->nr = rds_sock_count;
+	lens->nr = cnt;
 	lens->each = sizeof(struct rds6_info_socket);
 
 	spin_unlock_bh(&rds_sock_lock);
diff --git a/net/rds/connection.c b/net/rds/connection.c
index c10b7ed06c49f..7c8ab8e973e1b 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -568,6 +568,7 @@ static void rds_conn_message_info_cmn(struct socket *sock, unsigned int len,
 				      struct rds_info_lengths *lens,
 				      int want_send, bool isv6)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct list_head *list;
 	struct rds_connection *conn;
@@ -590,6 +591,9 @@ static void rds_conn_message_info_cmn(struct socket *sock, unsigned int len,
 			struct rds_conn_path *cp;
 			int npaths;
 
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
 			if (!isv6 && conn->c_isv6)
 				continue;
 
@@ -688,6 +692,7 @@ void rds_for_each_conn_info(struct socket *sock, unsigned int len,
 			  u64 *buffer,
 			  size_t item_len)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct rds_connection *conn;
 	size_t i;
@@ -700,6 +705,9 @@ void rds_for_each_conn_info(struct socket *sock, unsigned int len,
 	for (i = 0, head = rds_conn_hash; i < ARRAY_SIZE(rds_conn_hash);
 	     i++, head++) {
 		hlist_for_each_entry_rcu(conn, head, c_hash_node) {
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
 
 			/* Zero the per-item buffer before handing it to the
 			 * visitor so any field the visitor does not write -
@@ -733,6 +741,7 @@ static void rds_walk_conn_path_info(struct socket *sock, unsigned int len,
 				    u64 *buffer,
 				    size_t item_len)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct rds_connection *conn;
 	size_t i;
@@ -747,6 +756,10 @@ static void rds_walk_conn_path_info(struct socket *sock, unsigned int len,
 		hlist_for_each_entry_rcu(conn, head, c_hash_node) {
 			struct rds_conn_path *cp;
 
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
+
 			/* XXX We only copy the information from the first
 			 * path for now. The problem is that if there are
 			 * more than one underlying paths, we cannot report
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 5830b31a1f37b..fbaf150e3edaa 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -46,14 +46,6 @@
 static DEFINE_SPINLOCK(rds_tcp_tc_list_lock);
 static LIST_HEAD(rds_tcp_tc_list);
 
-/* rds_tcp_tc_count counts only IPv4 connections.
- * rds6_tcp_tc_count counts both IPv4 and IPv6 connections.
- */
-static unsigned int rds_tcp_tc_count;
-#if IS_ENABLED(CONFIG_IPV6)
-static unsigned int rds6_tcp_tc_count;
-#endif
-
 /* Track rds_tcp_connection structs so they can be cleaned up */
 static DEFINE_SPINLOCK(rds_tcp_conn_lock);
 static LIST_HEAD(rds_tcp_conn_list);
@@ -110,11 +102,6 @@ void rds_tcp_restore_callbacks(struct socket *sock,
 	/* done under the callback_lock to serialize with write_space */
 	spin_lock(&rds_tcp_tc_list_lock);
 	list_del_init(&tc->t_list_item);
-#if IS_ENABLED(CONFIG_IPV6)
-	rds6_tcp_tc_count--;
-#endif
-	if (!tc->t_cpath->cp_conn->c_isv6)
-		rds_tcp_tc_count--;
 	spin_unlock(&rds_tcp_tc_list_lock);
 
 	tc->t_sock = NULL;
@@ -206,11 +193,6 @@ void rds_tcp_set_callbacks(struct socket *sock, struct rds_conn_path *cp)
 	spin_lock(&rds_tcp_tc_list_lock);
 	tc->t_sock = sock;
 	list_add_tail(&tc->t_list_item, &rds_tcp_tc_list);
-#if IS_ENABLED(CONFIG_IPV6)
-	rds6_tcp_tc_count++;
-#endif
-	if (!tc->t_cpath->cp_conn->c_isv6)
-		rds_tcp_tc_count++;
 	spin_unlock(&rds_tcp_tc_list_lock);
 
 	/* accepted sockets need our listen data ready undone */
@@ -238,20 +220,37 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
 			    struct rds_info_iterator *iter,
 			    struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(rds_sock->sk);
 	struct rds_info_tcp_socket tsinfo;
 	struct rds_tcp_connection *tc;
+	unsigned int cnt = 0;
+	unsigned int copied = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&rds_tcp_tc_list_lock, flags);
 
-	if (len / sizeof(tsinfo) < rds_tcp_tc_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
+		if (tc->t_cpath->cp_conn->c_isv6)
+			continue;
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+		cnt++;
+	}
+
+	if (len / sizeof(tsinfo) < cnt)
 		goto out;
 
 	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
 		struct inet_sock *inet = inet_sk(tc->t_sock->sk);
 
+		if (copied >= cnt)
+			break;
 		if (tc->t_cpath->cp_conn->c_isv6)
 			continue;
+		/* Only show connections in the caller's netns. */
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
 
 		tsinfo.local_addr = inet->inet_saddr;
 		tsinfo.local_port = inet->inet_sport;
@@ -266,10 +265,12 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
 		tsinfo.tos = tc->t_cpath->cp_conn->c_tos;
 
 		rds_info_copy(iter, &tsinfo, sizeof(tsinfo));
+		copied++;
 	}
+	cnt = copied;
 
 out:
-	lens->nr = rds_tcp_tc_count;
+	lens->nr = cnt;
 	lens->each = sizeof(tsinfo);
 
 	spin_unlock_irqrestore(&rds_tcp_tc_list_lock, flags);
@@ -284,19 +285,35 @@ static void rds6_tcp_tc_info(struct socket *sock, unsigned int len,
 			     struct rds_info_iterator *iter,
 			     struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds6_info_tcp_socket tsinfo6;
 	struct rds_tcp_connection *tc;
+	unsigned int cnt = 0;
+	unsigned int copied = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&rds_tcp_tc_list_lock, flags);
 
-	if (len / sizeof(tsinfo6) < rds6_tcp_tc_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+		cnt++;
+	}
+
+	if (len / sizeof(tsinfo6) < cnt)
 		goto out;
 
 	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
 		struct sock *sk = tc->t_sock->sk;
 		struct inet_sock *inet = inet_sk(sk);
 
+		if (copied >= cnt)
+			break;
+		/* Only show connections in the caller's netns. */
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+
 		tsinfo6.local_addr = sk->sk_v6_rcv_saddr;
 		tsinfo6.local_port = inet->inet_sport;
 		tsinfo6.peer_addr = sk->sk_v6_daddr;
@@ -309,10 +326,12 @@ static void rds6_tcp_tc_info(struct socket *sock, unsigned int len,
 		tsinfo6.last_seen_una = tc->t_last_seen_una;
 
 		rds_info_copy(iter, &tsinfo6, sizeof(tsinfo6));
+		copied++;
 	}
+	cnt = copied;
 
 out:
-	lens->nr = rds6_tcp_tc_count;
+	lens->nr = cnt;
 	lens->each = sizeof(tsinfo6);
 
 	spin_unlock_irqrestore(&rds_tcp_tc_list_lock, flags);
-- 
2.34.1
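
For readers following the count-then-copy logic that the changelog describes for rds_sock_info, rds6_sock_info, rds_tcp_tc_info and rds6_tcp_tc_info, here is a minimal userspace sketch of the same shape. It is not kernel code and not part of the patch. struct entry, struct lens_sketch, the integer netns_id and info_copy_filtered() are hypothetical stand-ins (the patch compares struct net pointers with net_eq() and copies with rds_info_copy()). No locking or concurrency is modelled, so in this sketch the "copied >= cnt" cap and the final "cnt = copied" never change the result; they are kept only to mirror the structure of the handlers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry: a namespace id plus one word of payload. */
struct entry {
	int netns_id;
	int payload;
};

/* Hypothetical stand-in for the nr field of struct rds_info_lengths. */
struct lens_sketch {
	unsigned int nr;
};

/*
 * Two-pass, namespace-filtered copy. cap is the number of entries the
 * caller's buffer can hold. Returns the number of entries written to dst.
 */
static unsigned int info_copy_filtered(const struct entry *list, size_t n,
				       int caller_ns, int *dst,
				       unsigned int cap,
				       struct lens_sketch *lens)
{
	unsigned int cnt = 0;
	unsigned int copied = 0;
	size_t i;

	/* A zero-length probe may pass a NULL buffer. */
	assert(dst != NULL || cap == 0);

	/* First pass: count entries visible in the caller's namespace. */
	for (i = 0; i < n; i++) {
		if (list[i].netns_id != caller_ns)
			continue;
		cnt++;
	}

	/* Buffer too small: report the needed count, copy nothing. */
	if (cap < cnt)
		goto out;

	/* Second pass: copy matching entries, never more than cnt. */
	for (i = 0; i < n; i++) {
		if (copied >= cnt)
			break;
		if (list[i].netns_id != caller_ns)
			continue;
		dst[copied++] = list[i].payload;
	}
	/* Report what was actually written. */
	cnt = copied;

out:
	lens->nr = cnt;
	return copied;
}
```

A caller would typically probe with cap set to 0 to learn lens->nr, size its buffer to that value, and call again. Because the count is taken per namespace, the second call does not fail on entries that belong to other namespaces.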