From: Maoyi Xie
To: Simon Horman, Allison Henderson, netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    linux-rdma@vger.kernel.org, rds-devel@oss.oracle.com,
    linux-kernel@vger.kernel.org, Praveen Kakkolangara, Maoyi Xie
Subject: [PATCH net v3] rds: filter RDS_INFO_* getsockopt by caller's netns
Date: Mon, 11 May 2026 15:02:11 +0800
Message-Id: <20260511070211.1033178-1-maoyi.xie@ntu.edu.sg>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

The RDS_INFO_* family of getsockopt(2) options reads several file-scope
global lists that are not per-netns:

  rds_sock_info / rds6_sock_info,
  rds_sock_inc_info / rds6_sock_inc_info
      -> rds_sock_list
  rds_tcp_tc_info / rds6_tcp_tc_info
      -> rds_tcp_tc_list
  rds_conn_info / rds6_conn_info,
  rds_conn_message_info_cmn (for the *_SEND_MESSAGES and
  *_RETRANS_MESSAGES variants),
  rds_for_each_conn_info (for RDS_INFO_IB_CONNECTIONS)
      -> rds_conn_hash[]

The handlers do not filter by the caller's network namespace.
rds_info_getsockopt() has no netns or capable() check, and rds_create()
has no capable() check, so AF_RDS is reachable from an unprivileged
user namespace.

As a result, an unprivileged caller in a fresh user_ns plus netns can
read:

  - the bound address and sock inode of every RDS socket on the host,
  - the peer address of incoming messages on every RDS socket on the
    host,
  - the peer address and TCP sequence numbers of every rds-tcp
    connection on the host, and
  - the peer address and RDS sequence numbers of every RDS connection
    on the host.

The rds-tcp transport is reachable from a non-initial netns (see
rds_set_transport()), so a one-shot init_net gate at
rds_info_getsockopt() would deny legitimate per-netns visibility to
rds-tcp callers.

Instead, filter at each handler by comparing the netns of the caller's
socket to the netns of the list entry, or to rds_conn_net(conn) for
connection paths. Only copy entries whose netns matches the caller.
Counters (RDS_INFO_COUNTERS) are aggregate statistics and remain
global.

Reproducer (KASAN VM, rds and rds_tcp loaded): an AF_RDS socket binds
127.0.0.1:4242 in init_net as root. A child process enters a fresh
user_ns plus netns and opens AF_RDS there, then calls
getsockopt(SOL_RDS, RDS_INFO_SOCKETS). Before this change, the child
sees the init_net socket. After this change, the child sees zero
entries.

Suggested-by: Allison Henderson
Suggested-by: Simon Horman
Reviewed-by: Allison Henderson
Co-developed-by: Praveen Kakkolangara
Signed-off-by: Praveen Kakkolangara
Signed-off-by: Maoyi Xie
---
v3: Address Simon Horman's review of v2. The size precheck and the lens
count are now both restricted to the caller's netns in rds_sock_info,
rds6_sock_info, rds_tcp_tc_info and rds6_tcp_tc_info. Each handler now
does a first pass under the list lock to count entries visible in the
caller's netns, then short-circuits with that count if the user buffer
is too small, then does a second pass to fill data. This closes both
issues Simon flagged: a zero-length probe no longer returns the global
count, and a caller that sizes its buffer to the value returned by lens
no longer hits ENOSPC on the second call.

Re-verified on KASAN VM with the v1 PoC: attacker in fresh user_ns +
netns sees zero RDS_INFO_SOCKETS entries; init_net access sees its own
entries; lens returns the ns-scoped count on both probe and full reads.

v2: rebased onto net/main tip (b266bacba) so patchwork can apply. No
code changes. Carries forward Reviewed-by from v1 review.

v1: https://lore.kernel.org/r/20260506075031.2238596-1-maoyixie.tju@gmail.com

 net/rds/af_rds.c     | 42 ++++++++++++++++++++++++++++++++++++------
 net/rds/connection.c | 13 +++++++++++++
 net/rds/tcp.c        | 35 +++++++++++++++++++++++++++++++----
 3 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 76f625986..6e22b516b 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -735,6 +735,7 @@ static void rds_sock_inc_info(struct socket *sock, unsigned int len,
 			      struct rds_info_iterator *iter,
 			      struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_sock *rs;
 	struct rds_incoming *inc;
 	unsigned int total = 0;
@@ -744,6 +745,9 @@ static void rds_sock_inc_info(struct socket *sock, unsigned int len,
 	spin_lock_bh(&rds_sock_lock);
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		/* This option only supports IPv4 sockets. */
 		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
 			continue;
@@ -774,6 +778,7 @@ static void rds6_sock_inc_info(struct socket *sock, unsigned int len,
 			       struct rds_info_iterator *iter,
 			       struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_incoming *inc;
 	unsigned int total = 0;
 	struct rds_sock *rs;
@@ -783,6 +788,9 @@ static void rds6_sock_inc_info(struct socket *sock, unsigned int len,
 	spin_lock_bh(&rds_sock_lock);
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		read_lock(&rs->rs_recv_lock);
 
 		list_for_each_entry(inc, &rs->rs_recv_queue, i_item) {
@@ -806,6 +814,7 @@ static void rds_sock_info(struct socket *sock, unsigned int len,
 			  struct rds_info_iterator *iter,
 			  struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds_info_socket sinfo;
 	unsigned int cnt = 0;
 	struct rds_sock *rs;
@@ -814,12 +823,22 @@ static void rds_sock_info(struct socket *sock, unsigned int len,
 
 	spin_lock_bh(&rds_sock_lock);
 
-	if (len < rds_sock_count) {
-		cnt = rds_sock_count;
-		goto out;
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
+		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
+			continue;
+		cnt++;
 	}
 
+	if (len < cnt)
+		goto out;
+
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		/* This option only supports IPv4 sockets. */
 		if (!ipv6_addr_v4mapped(&rs->rs_bound_addr))
 			continue;
@@ -832,7 +851,6 @@ static void rds_sock_info(struct socket *sock, unsigned int len,
 		sinfo.inum = sock_i_ino(rds_rs_to_sk(rs));
 
 		rds_info_copy(iter, &sinfo, sizeof(sinfo));
-		cnt++;
 	}
 
 out:
@@ -847,17 +865,29 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
 			   struct rds_info_iterator *iter,
 			   struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds6_info_socket sinfo6;
+	unsigned int cnt = 0;
 	struct rds_sock *rs;
 
 	len /= sizeof(struct rds6_info_socket);
 
 	spin_lock_bh(&rds_sock_lock);
 
-	if (len < rds_sock_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
+		cnt++;
+	}
+
+	if (len < cnt)
 		goto out;
 
 	list_for_each_entry(rs, &rds_sock_list, rs_item) {
+		/* Only show sockets in the caller's netns. */
+		if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
+			continue;
 		sinfo6.sndbuf = rds_sk_sndbuf(rs);
 		sinfo6.rcvbuf = rds_sk_rcvbuf(rs);
 		sinfo6.bound_addr = rs->rs_bound_addr;
@@ -870,7 +900,7 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
 	}
 
 out:
-	lens->nr = rds_sock_count;
+	lens->nr = cnt;
 	lens->each = sizeof(struct rds6_info_socket);
 
 	spin_unlock_bh(&rds_sock_lock);
diff --git a/net/rds/connection.c b/net/rds/connection.c
index c10b7ed06..7c8ab8e97 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -568,6 +568,7 @@ static void rds_conn_message_info_cmn(struct socket *sock, unsigned int len,
 				      struct rds_info_lengths *lens,
 				      int want_send, bool isv6)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct list_head *list;
 	struct rds_connection *conn;
@@ -590,6 +591,9 @@ static void rds_conn_message_info_cmn(struct socket *sock, unsigned int len,
 			struct rds_conn_path *cp;
 			int npaths;
 
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
 			if (!isv6 && conn->c_isv6)
 				continue;
 
@@ -688,6 +692,7 @@ void rds_for_each_conn_info(struct socket *sock, unsigned int len,
 			  u64 *buffer,
 			  size_t item_len)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct rds_connection *conn;
 	size_t i;
@@ -700,6 +705,9 @@ void rds_for_each_conn_info(struct socket *sock, unsigned int len,
 	for (i = 0, head = rds_conn_hash; i < ARRAY_SIZE(rds_conn_hash);
 	     i++, head++) {
 		hlist_for_each_entry_rcu(conn, head, c_hash_node) {
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
 
 			/* Zero the per-item buffer before handing it to the
 			 * visitor so any field the visitor does not write -
@@ -733,6 +741,7 @@ static void rds_walk_conn_path_info(struct socket *sock, unsigned int len,
 				    u64 *buffer,
 				    size_t item_len)
 {
+	struct net *net = sock_net(sock->sk);
 	struct hlist_head *head;
 	struct rds_connection *conn;
 	size_t i;
@@ -747,6 +756,10 @@ static void rds_walk_conn_path_info(struct socket *sock, unsigned int len,
 		hlist_for_each_entry_rcu(conn, head, c_hash_node) {
 			struct rds_conn_path *cp;
 
+			/* Only show connections in the caller's netns. */
+			if (!net_eq(rds_conn_net(conn), net))
+				continue;
+
 			/* XXX We only copy the information from the first
 			 * path for now.  The problem is that if there are
 			 * more than one underlying paths, we cannot report
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 654e23d13..105e83507 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -235,13 +235,24 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
 			    struct rds_info_iterator *iter,
 			    struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(rds_sock->sk);
 	struct rds_info_tcp_socket tsinfo;
 	struct rds_tcp_connection *tc;
+	unsigned int cnt = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&rds_tcp_tc_list_lock, flags);
 
-	if (len / sizeof(tsinfo) < rds_tcp_tc_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
+		if (tc->t_cpath->cp_conn->c_isv6)
+			continue;
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+		cnt++;
+	}
+
+	if (len / sizeof(tsinfo) < cnt)
 		goto out;
 
 	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
@@ -249,6 +260,9 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
 
 		if (tc->t_cpath->cp_conn->c_isv6)
 			continue;
+		/* Only show connections in the caller's netns. */
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
 
 		tsinfo.local_addr = inet->inet_saddr;
 		tsinfo.local_port = inet->inet_sport;
@@ -266,7 +280,7 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
 	}
 
 out:
-	lens->nr = rds_tcp_tc_count;
+	lens->nr = cnt;
 	lens->each = sizeof(tsinfo);
 
 	spin_unlock_irqrestore(&rds_tcp_tc_list_lock, flags);
@@ -281,19 +295,32 @@ static void rds6_tcp_tc_info(struct socket *sock, unsigned int len,
 			     struct rds_info_iterator *iter,
 			     struct rds_info_lengths *lens)
 {
+	struct net *net = sock_net(sock->sk);
 	struct rds6_info_tcp_socket tsinfo6;
 	struct rds_tcp_connection *tc;
+	unsigned int cnt = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&rds_tcp_tc_list_lock, flags);
 
-	if (len / sizeof(tsinfo6) < rds6_tcp_tc_count)
+	/* First pass: count entries visible in the caller's netns. */
+	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+		cnt++;
+	}
+
+	if (len / sizeof(tsinfo6) < cnt)
 		goto out;
 
 	list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
 		struct sock *sk = tc->t_sock->sk;
 		struct inet_sock *inet = inet_sk(sk);
 
+		/* Only show connections in the caller's netns. */
+		if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
+			continue;
+
 		tsinfo6.local_addr = sk->sk_v6_rcv_saddr;
 		tsinfo6.local_port = inet->inet_sport;
 		tsinfo6.peer_addr = sk->sk_v6_daddr;
@@ -309,7 +336,7 @@ static void rds6_tcp_tc_info(struct socket *sock, unsigned int len,
 	}
 
 out:
-	lens->nr = rds6_tcp_tc_count;
+	lens->nr = cnt;
 	lens->each = sizeof(tsinfo6);
 
 	spin_unlock_irqrestore(&rds_tcp_tc_list_lock, flags);

base-commit: b266bacba796ff5c4dcd2ae2fc08aacf7ab39153
-- 
2.34.1
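
For readers who want the count-then-fill pattern from the v3 notes in
isolation, here is a minimal userspace sketch. It is not kernel code and
not part of the patch: demo_entry, demo_lens and demo_scoped_info are
hypothetical names, an integer ns_id stands in for struct net *, and
there is no locking. It only illustrates that the size check and the
reported count both use the caller-scoped count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a global list entry; ns_id models struct net *. */
struct demo_entry {
	int ns_id;
	int value;
};

/* Hypothetical analogue of struct rds_info_lengths. */
struct demo_lens {
	unsigned int nr;
	unsigned int each;
};

/*
 * Copy the values of entries whose ns_id matches caller_ns into out[],
 * which has room for out_cap items. Returns the number of items copied.
 * lens->nr always reports the namespace-scoped count, so a zero-length
 * probe followed by a read sized from lens->nr succeeds.
 */
static unsigned int demo_scoped_info(const struct demo_entry *list, size_t n,
				     int caller_ns, int *out,
				     unsigned int out_cap,
				     struct demo_lens *lens)
{
	unsigned int cnt = 0;
	unsigned int copied = 0;
	size_t i;

	assert(lens != NULL);

	/* First pass: count only entries visible to the caller. */
	for (i = 0; i < n; i++) {
		if (list[i].ns_id == caller_ns)
			cnt++;
	}

	/* Second pass: fill only if the buffer holds every visible entry. */
	if (out_cap >= cnt) {
		for (i = 0; i < n; i++) {
			if (list[i].ns_id != caller_ns)
				continue;
			out[copied++] = list[i].value;
		}
	}

	lens->nr = cnt;
	lens->each = sizeof(out[0]);
	return copied;
}
```

A caller would first invoke demo_scoped_info() with out_cap set to 0 to
learn lens->nr, then call again with a buffer of that many items. Had
the precheck used the global count instead, that second call could be
rejected even though the buffer fits every entry the caller is allowed
to see.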