From: Bobby Eshleman
Date: Tue, 11 Nov 2025 22:54:49 -0800
Subject: [PATCH net-next v9 07/14] vhost/vsock: add netns support
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20251111-vsock-vmtest-v9-7-852787a37bed@meta.com>
References: <20251111-vsock-vmtest-v9-0-852787a37bed@meta.com>
In-Reply-To: <20251111-vsock-vmtest-v9-0-852787a37bed@meta.com>
To: Stefano Garzarella, Shuah Khan, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman, Stefan Hajnoczi,
 "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
 "K. Y. Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui, Bryan Tan,
 Vishnu Dasa, Broadcom internal kernel review list, Bobby Eshleman
Cc: virtualization@lists.linux.dev, netdev@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
 kvm@vger.kernel.org, linux-hyperv@vger.kernel.org, Sargun Dhillon,
 berrange@redhat.com, Bobby Eshleman
X-Mailer: b4 0.14.3

From: Bobby Eshleman

Add the ability to isolate vhost-vsock flows using namespaces.

The VM, via the vhost_vsock struct, inherits its namespace from the
process that opens the vhost-vsock device. vhost_vsock lookup functions
are modified to take into account the mode (e.g., if CIDs are matching
but modes don't align, then return NULL).

When namespace modes are evaluated during socket usage we always use the
mode of the namespace at the time the vhost vsock device file was
opened. If that namespace is later changed from "global" to "local"
mode, the vsock will continue operating as if the change never happened
(i.e., it is in "global" mode). This avoids breaking already established
flows.

vhost_vsock now acquires a reference to the namespace.

Suggested-by: Sargun Dhillon
Signed-off-by: Bobby Eshleman
---
Changes in v9:
- add more information about net_mode and rationale (changing modes) to
  both code comment and commit message

Changes in v7:
- remove the check_global flag of vhost_vsock_get(), that logic was both
  wrong and not necessary, reuse vsock_net_check_mode() instead
- remove 'delete me' comment

Changes in v5:
- respect pid namespaces when assigning namespace to vhost_vsock
---
 drivers/vhost/vsock.c | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 0a0e73405532..09f9321e4bc8 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -46,6 +46,11 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8);
 struct vhost_vsock {
 	struct vhost_dev dev;
 	struct vhost_virtqueue vqs[2];
+	struct net *net;
+	netns_tracker ns_tracker;
+
+	/* The ns mode at the time vhost_vsock was created */
+	enum vsock_net_mode net_mode;
 
 	/* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */
 	struct hlist_node hash;
@@ -67,7 +72,8 @@ static u32 vhost_transport_get_local_cid(void)
 /* Callers that dereference the return value must hold vhost_vsock_mutex or the
  * RCU read lock.
  */
-static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
+static struct vhost_vsock *vhost_vsock_get(u32 guest_cid, struct net *net,
+					   enum vsock_net_mode mode)
 {
 	struct vhost_vsock *vsock;
 
@@ -78,9 +84,9 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
 		if (other_cid == 0)
 			continue;
 
-		if (other_cid == guest_cid)
+		if (other_cid == guest_cid &&
+		    vsock_net_check_mode(net, mode, vsock->net, vsock->net_mode))
 			return vsock;
-
 	}
 
 	return NULL;
@@ -279,7 +285,7 @@ vhost_transport_send_pkt(struct sk_buff *skb, struct net *net,
 	rcu_read_lock();
 
 	/* Find the vhost_vsock according to guest context id */
-	vsock = vhost_vsock_get(le64_to_cpu(hdr->dst_cid));
+	vsock = vhost_vsock_get(le64_to_cpu(hdr->dst_cid), net, net_mode);
 	if (!vsock) {
 		rcu_read_unlock();
 		kfree_skb(skb);
@@ -306,7 +312,8 @@ vhost_transport_cancel_pkt(struct vsock_sock *vsk)
 	rcu_read_lock();
 
 	/* Find the vhost_vsock according to guest context id */
-	vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
+	vsock = vhost_vsock_get(vsk->remote_addr.svm_cid,
+				sock_net(sk_vsock(vsk)), vsk->net_mode);
 	if (!vsock)
 		goto out;
 
@@ -463,11 +470,12 @@ static struct virtio_transport vhost_transport = {
 
 static bool vhost_transport_seqpacket_allow(struct vsock_sock *vsk, u32 remote_cid)
 {
+	struct net *net = sock_net(sk_vsock(vsk));
 	struct vhost_vsock *vsock;
 	bool seqpacket_allow = false;
 
 	rcu_read_lock();
-	vsock = vhost_vsock_get(remote_cid);
+	vsock = vhost_vsock_get(remote_cid, net, vsk->net_mode);
 
 	if (vsock)
 		seqpacket_allow = vsock->seqpacket_allow;
@@ -538,8 +546,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 		if (le64_to_cpu(hdr->src_cid) == vsock->guest_cid &&
 		    le64_to_cpu(hdr->dst_cid) ==
 		    vhost_transport_get_local_cid())
-			virtio_transport_recv_pkt(&vhost_transport, skb, NULL,
-						  0);
+			virtio_transport_recv_pkt(&vhost_transport, skb,
+						  vsock->net, vsock->net_mode);
 		else
 			kfree_skb(skb);
 
@@ -654,8 +662,10 @@ static void vhost_vsock_free(struct vhost_vsock *vsock)
 
 static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
 {
+
 	struct vhost_virtqueue **vqs;
 	struct vhost_vsock *vsock;
+	struct net *net;
 	int ret;
 
 	/* This struct is large and allocation could fail, fall back to vmalloc
@@ -671,6 +681,17 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
 		goto out;
 	}
 
+	net = current->nsproxy->net_ns;
+	vsock->net = get_net_track(net, &vsock->ns_tracker, GFP_KERNEL);
+
+	/* Store the mode of the namespace at the time of creation. If this
+	 * namespace later changes from "global" to "local", we want this vsock
+	 * to continue operating normally and not suddenly break. For that
+	 * reason, we save the mode here and later use it when performing
+	 * socket lookups with vsock_net_check_mode() (see vhost_vsock_get()).
+	 */
+	vsock->net_mode = vsock_net_mode(net);
+
 	vsock->guest_cid = 0; /* no CID assigned yet */
 	vsock->seqpacket_allow = false;
 
@@ -710,7 +731,7 @@ static void vhost_vsock_reset_orphans(struct sock *sk)
 	 */
 
 	/* If the peer is still valid, no need to reset connection */
-	if (vhost_vsock_get(vsk->remote_addr.svm_cid))
+	if (vhost_vsock_get(vsk->remote_addr.svm_cid, sock_net(sk), vsk->net_mode))
 		return;
 
 	/* If the close timeout is pending, let it expire. This avoids races
@@ -755,6 +776,7 @@ static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
 	virtio_vsock_skb_queue_purge(&vsock->send_pkt_queue);
 
 	vhost_dev_cleanup(&vsock->dev);
+	put_net_track(vsock->net, &vsock->ns_tracker);
 	kfree(vsock->dev.vqs);
 	vhost_vsock_free(vsock);
 	return 0;
@@ -781,7 +803,7 @@ static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u64 guest_cid)
 
 	/* Refuse if CID is already in use */
 	mutex_lock(&vhost_vsock_mutex);
-	other = vhost_vsock_get(guest_cid);
+	other = vhost_vsock_get(guest_cid, vsock->net, vsock->net_mode);
 	if (other && other != vsock) {
 		mutex_unlock(&vhost_vsock_mutex);
 		return -EADDRINUSE;
-- 
2.47.3
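
Illustrative note (not part of the patch): the lookup rule and the
"mode saved at open time" behaviour described in the commit message can
be modelled in plain userspace C. This is only a sketch. Every model_*
name below is hypothetical, and the body of model_check_mode() is an
assumption about what vsock_net_check_mode() decides (same namespace
always matches, different namespaces match only when both saved modes
are "global"); that helper is defined elsewhere in the series and is
not shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel types in the patch. */
enum model_net_mode { MODEL_MODE_GLOBAL, MODEL_MODE_LOCAL };

struct model_net {
	enum model_net_mode mode;	/* current mode, may change later */
};

struct model_vhost_vsock {
	uint32_t guest_cid;		/* 0 means no CID assigned yet */
	const struct model_net *net;	/* namespace of the opener */
	enum model_net_mode net_mode;	/* mode saved at open time */
};

/* ASSUMPTION: stands in for vsock_net_check_mode(); the real rule lives
 * elsewhere in the series and may differ from this sketch.
 */
static bool model_check_mode(const struct model_net *n0,
			     enum model_net_mode m0,
			     const struct model_net *n1,
			     enum model_net_mode m1)
{
	if (n0 == n1)
		return true;
	return m0 == MODEL_MODE_GLOBAL && m1 == MODEL_MODE_GLOBAL;
}

/* Mirrors the open path: record the namespace and snapshot its mode. */
static void model_open(struct model_vhost_vsock *v,
		       const struct model_net *net, uint32_t cid)
{
	assert(v && net);
	v->guest_cid = cid;
	v->net = net;
	v->net_mode = net->mode;
}

/* Mirrors the lookup: the CID must match and the modes must align. */
static struct model_vhost_vsock *
model_get(struct model_vhost_vsock *tbl, size_t n, uint32_t cid,
	  const struct model_net *net, enum model_net_mode mode)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].guest_cid == 0)
			continue;
		if (tbl[i].guest_cid == cid &&
		    model_check_mode(net, mode, tbl[i].net, tbl[i].net_mode))
			return &tbl[i];
	}
	return NULL;
}
```

In this model, a device opened while its namespace is "global" stays
reachable from another "global" namespace even after the opener's
namespace switches to "local", because the lookup compares the saved
net_mode and not the namespace's current mode.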