From: Bobby Eshleman <bobbyeshleman@meta.com>

Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
is operational. Additionally, reject G2H transport registration if there
already exists a namespace in local mode.

G2H sockets break in local mode because the G2H transports don't support
namespacing yet. The current approach is to coerce packets coming out of
G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
to coerce sockets in the same way because it cannot be deduced which
transport will be used by the socket. Specifically, when bound to
VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
until a packet is received and matched to the bound socket that we
assign the transport. This presents a chicken-and-egg problem, because
we need the namespace to lookup the socket and resolve the transport,
but we need the transport to know how to use the namespace during
lookup.

For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
used on systems that support G2H, even nested systems that also have H2G
transports.

Local mode is blocked based on detecting the presence of G2H devices
(when possible, as hyperv is special). This means that a host kernel
with G2H support compiled in (or has the module loaded), will still
support local mode if there is no G2H (e.g., virtio-vsock) device
detected. This enables using the same kernel in the host and in the
guest, as we do in kselftest.

Systems with only namespace-aware transports (vhost-vsock, loopback) can
still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
intended.

Add supports_local_mode() transport callback to indicate
transport-specific local mode support.

These restrictions can be lifted in a future patch series when G2H
transports gain namespace support.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v11:
- vhost_transport_supports_local_mode() returns false to keep things
  disabled until support comes online (Stefano)
- add comment above supports_local_mode() cb to clarify (Stefano)
- Remove redundant `ret = 0` initialization in vsock_net_mode_string()
  (Stefano)
- Refactor vsock_net_mode_string() to separate parsing from validation
  (Stefano)
- vmci returns false for supports_local_mode(), with comment

Changes in v10:
- move this patch before any transports bring online namespacing (Stefano)
- move vsock_net_mode_string into critical section (Stefano)
- add ->supports_local_mode() callback to transports (Stefano)
---
drivers/vhost/vsock.c | 6 ++++++
include/net/af_vsock.h | 11 +++++++++++
net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++
net/vmw_vsock/hyperv_transport.c | 6 ++++++
net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
net/vmw_vsock/vmci_transport.c | 12 ++++++++++++
net/vmw_vsock/vsock_loopback.c | 6 ++++++
7 files changed, 86 insertions(+)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 69074656263d..4e3856aa2479 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void)
return VHOST_VSOCK_DEFAULT_HOST_CID;
}
+static bool vhost_transport_supports_local_mode(void)
+{
+ return false;
+}
+
/* Callers that dereference the return value must hold vhost_vsock_mutex or the
* RCU read lock.
*/
@@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = {
.module = THIS_MODULE,
.get_local_cid = vhost_transport_get_local_cid,
+ .supports_local_mode = vhost_transport_supports_local_mode,
.init = virtio_transport_do_socket_init,
.destruct = virtio_transport_destruct,
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 59d97a143204..e24ef1d9fe02 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -180,6 +180,17 @@ struct vsock_transport {
/* Addressing. */
u32 (*get_local_cid)(void);
+ /* Return true if the transport is compatible with
+ * VSOCK_NET_MODE_LOCAL. Otherwise, return false.
+ *
+ * Transports should return false if they lack local-mode namespace
+ * support (e.g., G2H transports like hyperv-vsock and vmci-vsock).
+ * virtio-vsock returns true only if no device is present in order to
+ * enable local mode in nested scenarios in which virtio-vsock is
+ * loaded or built-in, but nonetheless unusable by sockets.
+ */
+ bool (*supports_local_mode)(void);
+
/* Read a single skb */
int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 243c0d588682..120adb9dad9f 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -91,6 +91,12 @@
* and locked down by a namespace manager. The default is "global". The mode
* is set per-namespace.
*
+ * Note: LOCAL mode is only supported when using namespace-aware transports
+ * (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
+ * hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
+ * fail with EOPNOTSUPP, as these transports do not support per-namespace
+ * isolation.
+ *
* The modes affect the allocation and accessibility of CIDs as follows:
*
* - global - access and allocation are all system-wide
@@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
else
return -EINVAL;
+ mutex_lock(&vsock_register_mutex);
+ if (mode == VSOCK_NET_MODE_LOCAL &&
+ transport_g2h && transport_g2h->supports_local_mode &&
+ !transport_g2h->supports_local_mode()) {
+ mutex_unlock(&vsock_register_mutex);
+ return -EOPNOTSUPP;
+ }
+ mutex_unlock(&vsock_register_mutex);
+
if (!vsock_net_write_mode(net, mode))
return -EPERM;
@@ -2938,6 +2953,7 @@ int vsock_core_register(const struct vsock_transport *t, int features)
{
const struct vsock_transport *t_h2g, *t_g2h, *t_dgram, *t_local;
int err = mutex_lock_interruptible(&vsock_register_mutex);
+ struct net *net;
if (err)
return err;
@@ -2960,6 +2976,22 @@ int vsock_core_register(const struct vsock_transport *t, int features)
err = -EBUSY;
goto err_busy;
}
+
+ /* G2H sockets break in LOCAL mode namespaces because G2H
+ * transports don't support them yet. Block registering new G2H
+ * transports if we already have local mode namespaces on the
+ * system.
+ */
+ rcu_read_lock();
+ for_each_net_rcu(net) {
+ if (vsock_net_mode(net) == VSOCK_NET_MODE_LOCAL) {
+ rcu_read_unlock();
+ err = -EOPNOTSUPP;
+ goto err_busy;
+ }
+ }
+ rcu_read_unlock();
+
t_g2h = t;
}
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 432fcbbd14d4..279f04fcd81a 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -833,10 +833,16 @@ int hvs_notify_set_rcvlowat(struct vsock_sock *vsk, int val)
return -EOPNOTSUPP;
}
+static bool hvs_supports_local_mode(void)
+{
+ return false;
+}
+
static struct vsock_transport hvs_transport = {
.module = THIS_MODULE,
.get_local_cid = hvs_get_local_cid,
+ .supports_local_mode = hvs_supports_local_mode,
.init = hvs_sock_init,
.destruct = hvs_destruct,
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index d365a4b371d0..af4fbce0baab 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -94,6 +94,18 @@ static u32 virtio_transport_get_local_cid(void)
return ret;
}
+static bool virtio_transport_supports_local_mode(void)
+{
+ struct virtio_vsock *vsock;
+
+ rcu_read_lock();
+ vsock = rcu_dereference(the_virtio_vsock);
+ rcu_read_unlock();
+
+ /* Local mode is supported only when no G2H device is present. */
+ return vsock ? false : true;
+}
+
/* Caller need to hold vsock->tx_lock on vq */
static int virtio_transport_send_skb(struct sk_buff *skb, struct virtqueue *vq,
struct virtio_vsock *vsock, gfp_t gfp)
@@ -544,6 +556,7 @@ static struct virtio_transport virtio_transport = {
.module = THIS_MODULE,
.get_local_cid = virtio_transport_get_local_cid,
+ .supports_local_mode = virtio_transport_supports_local_mode,
.init = virtio_transport_do_socket_init,
.destruct = virtio_transport_destruct,
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 7eccd6708d66..e392d3d1fd90 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -2033,6 +2033,17 @@ static u32 vmci_transport_get_local_cid(void)
return vmci_get_context_id();
}
+static bool vmci_transport_supports_local_mode(void)
+{
+ /* Local mode is not yet compatible with vmci because there is no clear
+ * mechanism yet for attaching a namespace to a VM, or for handling the
+ * namespacing for when neither H2G or G2H is registered (as is the
+ * case for MODULE_ALIAS_NETPROTO(PF_VSOCK) loading. To simplify, we
+ * keep local mode off for now.
+ */
+ return false;
+}
+
static struct vsock_transport vmci_transport = {
.module = THIS_MODULE,
.init = vmci_transport_socket_init,
@@ -2062,6 +2073,7 @@ static struct vsock_transport vmci_transport = {
.notify_send_post_enqueue = vmci_transport_notify_send_post_enqueue,
.shutdown = vmci_transport_shutdown,
.get_local_cid = vmci_transport_get_local_cid,
+ .supports_local_mode = vmci_transport_supports_local_mode,
};
static bool vmci_check_transport(struct vsock_sock *vsk)
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 8722337a4f80..1e25c1a6b43f 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -26,6 +26,11 @@ static u32 vsock_loopback_get_local_cid(void)
return VMADDR_CID_LOCAL;
}
+static bool vsock_loopback_supports_local_mode(void)
+{
+ return true;
+}
+
static int vsock_loopback_send_pkt(struct sk_buff *skb)
{
struct vsock_loopback *vsock = &the_vsock_loopback;
@@ -58,6 +63,7 @@ static struct virtio_transport loopback_transport = {
.module = THIS_MODULE,
.get_local_cid = vsock_loopback_get_local_cid,
+ .supports_local_mode = vsock_loopback_supports_local_mode,
.init = virtio_transport_do_socket_init,
.destruct = virtio_transport_destruct,
--
2.47.3
On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
>From: Bobby Eshleman <bobbyeshleman@meta.com>
>
>Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
>is operational. Additionally, reject G2H transport registration if there
>already exists a namespace in local mode.
>
>G2H sockets break in local mode because the G2H transports don't support
>namespacing yet. The current approach is to coerce packets coming out of
>G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
>to coerce sockets in the same way because it cannot be deduced which
>transport will be used by the socket. Specifically, when bound to
>VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
>until a packet is received and matched to the bound socket that we
>assign the transport. This presents a chicken-and-egg problem, because
>we need the namespace to lookup the socket and resolve the transport,
>but we need the transport to know how to use the namespace during
>lookup.
>
>For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
>used on systems that support G2H, even nested systems that also have H2G
>transports.
>
>Local mode is blocked based on detecting the presence of G2H devices
>(when possible, as hyperv is special). This means that a host kernel
>with G2H support compiled in (or has the module loaded), will still
>support local mode if there is no G2H (e.g., virtio-vsock) device
>detected. This enables using the same kernel in the host and in the
>guest, as we do in kselftest.
>
>Systems with only namespace-aware transports (vhost-vsock, loopback) can
>still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
>intended.
>
>Add supports_local_mode() transport callback to indicate
>transport-specific local mode support.
>
>These restrictions can be lifted in a future patch series when G2H
>transports gain namespace support.
>
>Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
>---
>Changes in v11:
>- vhost_transport_supports_local_mode() returns false to keep things
> disabled until support comes online (Stefano)
>- add comment above supports_local_mode() cb to clarify (Stefano)
>- Remove redundant `ret = 0` initialization in vsock_net_mode_string()
> (Stefano)
>- Refactor vsock_net_mode_string() to separate parsing from validation
> (Stefano)
>- vmci returns false for supports_local_mode(), with comment
>
>Changes in v10:
>- move this patch before any transports bring online namespacing (Stefano)
>- move vsock_net_mode_string into critical section (Stefano)
>- add ->supports_local_mode() callback to transports (Stefano)
>---
> drivers/vhost/vsock.c | 6 ++++++
> include/net/af_vsock.h | 11 +++++++++++
> net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++
> net/vmw_vsock/hyperv_transport.c | 6 ++++++
> net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
> net/vmw_vsock/vmci_transport.c | 12 ++++++++++++
> net/vmw_vsock/vsock_loopback.c | 6 ++++++
> 7 files changed, 86 insertions(+)
>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index 69074656263d..4e3856aa2479 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void)
> return VHOST_VSOCK_DEFAULT_HOST_CID;
> }
>
>+static bool vhost_transport_supports_local_mode(void)
>+{
>+ return false;
>+}
>+
> /* Callers that dereference the return value must hold vhost_vsock_mutex or the
> * RCU read lock.
> */
>@@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = {
> .module = THIS_MODULE,
>
> .get_local_cid = vhost_transport_get_local_cid,
>+ .supports_local_mode = vhost_transport_supports_local_mode,
>
> .init = virtio_transport_do_socket_init,
> .destruct = virtio_transport_destruct,
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index 59d97a143204..e24ef1d9fe02 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -180,6 +180,17 @@ struct vsock_transport {
> /* Addressing. */
> u32 (*get_local_cid)(void);
>
>+ /* Return true if the transport is compatible with
>+ * VSOCK_NET_MODE_LOCAL. Otherwise, return false.
>+ *
>+ * Transports should return false if they lack local-mode namespace
>+ * support (e.g., G2H transports like hyperv-vsock and vmci-vsock).
>+ * virtio-vsock returns true only if no device is present in order to
>+ * enable local mode in nested scenarios in which virtio-vsock is
>+ * loaded or built-in, but nonetheless unusable by sockets.
>+ */
>+ bool (*supports_local_mode)(void);
>+
> /* Read a single skb */
> int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 243c0d588682..120adb9dad9f 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -91,6 +91,12 @@
> * and locked down by a namespace manager. The default is "global". The mode
> * is set per-namespace.
> *
>+ * Note: LOCAL mode is only supported when using namespace-aware transports
>+ * (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
>+ * hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
>+ * fail with EOPNOTSUPP, as these transports do not support per-namespace
>+ * isolation.
>+ *
> * The modes affect the allocation and accessibility of CIDs as follows:
> *
> * - global - access and allocation are all system-wide
>@@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
> else
> return -EINVAL;
>
>+ mutex_lock(&vsock_register_mutex);
>+ if (mode == VSOCK_NET_MODE_LOCAL &&
>+ transport_g2h && transport_g2h->supports_local_mode &&
>+ !transport_g2h->supports_local_mode()) {
>+ mutex_unlock(&vsock_register_mutex);
>+ return -EOPNOTSUPP;
>+ }
>+ mutex_unlock(&vsock_register_mutex);

Wait, I think we already discussed this: vsock_net_write_mode() must be
called with the lock held.

See
https://lore.kernel.org/netdev/aRTTwuuXSz5CvNjt@devvm11784.nha0.facebook.com/

Since I guess we need another version of this patch, can you check the
commit description to see if it reflects what we are doing now
(e.g., vhost is not enabled)?

Also, I don't understand why we will enable it later for vhost but are
enabling it now for virtio_transport and vsock_loopback, even though this
patch comes before the support for those transports. I'm a bit confused.

If something is unclear, let's discuss it before sending a new version.

What I had in mind was to add this patch and explain why we need the new
callback (like you did), but to enable the support in the patches that
actually enable it for each transport. But maybe what is not clear to me
is that we need this only for G2H. But now I'm confused about the
discussion around vmci H2G. We decided to discard that one too, but here
we are not checking that?
I mean, IIUC we are calling supports_local_mode() only on G2H here.

>+
> if (!vsock_net_write_mode(net, mode))
> return -EPERM;
>
>@@ -2938,6 +2953,7 @@ int vsock_core_register(const struct vsock_transport *t, int features)
> {
> const struct vsock_transport *t_h2g, *t_g2h, *t_dgram, *t_local;
> int err = mutex_lock_interruptible(&vsock_register_mutex);
>+ struct net *net;
>
> if (err)
> return err;
>@@ -2960,6 +2976,22 @@ int vsock_core_register(const struct vsock_transport *t, int features)
> err = -EBUSY;
> goto err_busy;
> }
>+
>+ /* G2H sockets break in LOCAL mode namespaces because G2H

Here too we are talking only about G2H, so what happens if vmci
is loaded as H2G?

IMO we should discuss this a bit more and make it more generic,
e.g., check all the transports.

Thanks,
Stefano
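
For illustration only, the registration-side concern above can be modeled in userspace. This is a rough sketch, not the kernel code: `struct net_model`, `struct transport_model`, `any_net_in_local_mode()` and `check_register()` are hypothetical names, the namespace list is a plain array instead of for_each_net_rcu(), and locking is omitted. Unlike the posted hunk, which rejects every new G2H transport once a local-mode namespace exists, this variant asks the transport's callback regardless of its role, which is one possible reading of "check all the transports".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum vsock_net_mode { VSOCK_NET_MODE_GLOBAL, VSOCK_NET_MODE_LOCAL };

/* Hypothetical stand-ins for struct net and struct vsock_transport. */
struct net_model {
	enum vsock_net_mode mode;
};

struct transport_model {
	bool (*supports_local_mode)(void);
};

/* Example callbacks, in the style of the stubs in the patch. */
static bool stub_returns_false(void) { return false; }
static bool stub_returns_true(void) { return true; }

/* Models the for_each_net_rcu() walk from the patch with a plain array. */
static bool any_net_in_local_mode(const struct net_model *nets, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (nets[i].mode == VSOCK_NET_MODE_LOCAL)
			return true;
	}
	return false;
}

/*
 * Returns 0 if the transport may register, -EOPNOTSUPP if it lacks local
 * mode support while some namespace is already in local mode. A missing
 * callback is treated as "no objection", as in the posted sysctl check.
 */
static int check_register(const struct transport_model *t,
			  const struct net_model *nets, size_t n)
{
	bool supports = !t->supports_local_mode || t->supports_local_mode();

	if (!supports && any_net_in_local_mode(nets, n))
		return -EOPNOTSUPP;
	return 0;
}
```

With one namespace in local mode, a transport whose callback returns false would be refused, while the same transport would register fine if every namespace were still global.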
>+ * transports don't support them yet. Block registering new G2H
>+ * transports if we already have local mode namespaces on the
>+ * system.
>+ */
>+ rcu_read_lock();
>+ for_each_net_rcu(net) {
>+ if (vsock_net_mode(net) == VSOCK_NET_MODE_LOCAL) {
>+ rcu_read_unlock();
>+ err = -EOPNOTSUPP;
>+ goto err_busy;
>+ }
>+ }
>+ rcu_read_unlock();
>+
> t_g2h = t;
> }
>
>diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
>index 432fcbbd14d4..279f04fcd81a 100644
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -833,10 +833,16 @@ int hvs_notify_set_rcvlowat(struct vsock_sock *vsk, int val)
> return -EOPNOTSUPP;
> }
>
>+static bool hvs_supports_local_mode(void)
>+{
>+ return false;
>+}
>+
> static struct vsock_transport hvs_transport = {
> .module = THIS_MODULE,
>
> .get_local_cid = hvs_get_local_cid,
>+ .supports_local_mode = hvs_supports_local_mode,
>
> .init = hvs_sock_init,
> .destruct = hvs_destruct,
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index d365a4b371d0..af4fbce0baab 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -94,6 +94,18 @@ static u32 virtio_transport_get_local_cid(void)
> return ret;
> }
>
>+static bool virtio_transport_supports_local_mode(void)
>+{
>+ struct virtio_vsock *vsock;
>+
>+ rcu_read_lock();
>+ vsock = rcu_dereference(the_virtio_vsock);
>+ rcu_read_unlock();
>+
>+ /* Local mode is supported only when no G2H device is present. */
>+ return vsock ? false : true;
>+}
>+
> /* Caller need to hold vsock->tx_lock on vq */
> static int virtio_transport_send_skb(struct sk_buff *skb, struct virtqueue *vq,
> struct virtio_vsock *vsock, gfp_t gfp)
>@@ -544,6 +556,7 @@ static struct virtio_transport virtio_transport = {
> .module = THIS_MODULE,
>
> .get_local_cid = virtio_transport_get_local_cid,
>+ .supports_local_mode = virtio_transport_supports_local_mode,
>
> .init = virtio_transport_do_socket_init,
> .destruct = virtio_transport_destruct,
>diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>index 7eccd6708d66..e392d3d1fd90 100644
>--- a/net/vmw_vsock/vmci_transport.c
>+++ b/net/vmw_vsock/vmci_transport.c
>@@ -2033,6 +2033,17 @@ static u32 vmci_transport_get_local_cid(void)
> return vmci_get_context_id();
> }
>
>+static bool vmci_transport_supports_local_mode(void)
>+{
>+ /* Local mode is not yet compatible with vmci because there is no clear
>+ * mechanism yet for attaching a namespace to a VM, or for handling the
>+ * namespacing for when neither H2G or G2H is registered (as is the
>+ * case for MODULE_ALIAS_NETPROTO(PF_VSOCK) loading. To simplify, we
>+ * keep local mode off for now.
>+ */
>+ return false;
>+}
>+
> static struct vsock_transport vmci_transport = {
> .module = THIS_MODULE,
> .init = vmci_transport_socket_init,
>@@ -2062,6 +2073,7 @@ static struct vsock_transport vmci_transport = {
> .notify_send_post_enqueue = vmci_transport_notify_send_post_enqueue,
> .shutdown = vmci_transport_shutdown,
> .get_local_cid = vmci_transport_get_local_cid,
>+ .supports_local_mode = vmci_transport_supports_local_mode,
> };
>
> static bool vmci_check_transport(struct vsock_sock *vsk)
>diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
>index 8722337a4f80..1e25c1a6b43f 100644
>--- a/net/vmw_vsock/vsock_loopback.c
>+++ b/net/vmw_vsock/vsock_loopback.c
>@@ -26,6 +26,11 @@ static u32 vsock_loopback_get_local_cid(void)
> return VMADDR_CID_LOCAL;
> }
>
>+static bool vsock_loopback_supports_local_mode(void)
>+{
>+ return true;
>+}
>+
> static int vsock_loopback_send_pkt(struct sk_buff *skb)
> {
> struct vsock_loopback *vsock = &the_vsock_loopback;
>@@ -58,6 +63,7 @@ static struct virtio_transport loopback_transport = {
> .module = THIS_MODULE,
>
> .get_local_cid = vsock_loopback_get_local_cid,
>+ .supports_local_mode = vsock_loopback_supports_local_mode,
>
> .init = virtio_transport_do_socket_init,
> .destruct = virtio_transport_destruct,
>
>--
>2.47.3
>
On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
> On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
> > From: Bobby Eshleman <bobbyeshleman@meta.com>
> >
> > Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
> > is operational. Additionally, reject G2H transport registration if there
> > already exists a namespace in local mode.
> >
> > G2H sockets break in local mode because the G2H transports don't support
> > namespacing yet. The current approach is to coerce packets coming out of
> > G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
> > to coerce sockets in the same way because it cannot be deduced which
> > transport will be used by the socket. Specifically, when bound to
> > VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
> > until a packet is received and matched to the bound socket that we
> > assign the transport. This presents a chicken-and-egg problem, because
> > we need the namespace to lookup the socket and resolve the transport,
> > but we need the transport to know how to use the namespace during
> > lookup.
> >
> > For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
> > used on systems that support G2H, even nested systems that also have H2G
> > transports.
> >
> > Local mode is blocked based on detecting the presence of G2H devices
> > (when possible, as hyperv is special). This means that a host kernel
> > with G2H support compiled in (or has the module loaded), will still
> > support local mode if there is no G2H (e.g., virtio-vsock) device
> > detected. This enables using the same kernel in the host and in the
> > guest, as we do in kselftest.
> >
> > Systems with only namespace-aware transports (vhost-vsock, loopback) can
> > still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
> > intended.
> >
> > Add supports_local_mode() transport callback to indicate
> > transport-specific local mode support.
> >
> > These restrictions can be lifted in a future patch series when G2H
> > transports gain namespace support.
> >
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > ---
> > Changes in v11:
> > - vhost_transport_supports_local_mode() returns false to keep things
> > disabled until support comes online (Stefano)
> > - add comment above supports_local_mode() cb to clarify (Stefano)
> > - Remove redundant `ret = 0` initialization in vsock_net_mode_string()
> > (Stefano)
> > - Refactor vsock_net_mode_string() to separate parsing from validation
> > (Stefano)
> > - vmci returns false for supports_local_mode(), with comment
> >
> > Changes in v10:
> > - move this patch before any transports bring online namespacing (Stefano)
> > - move vsock_net_mode_string into critical section (Stefano)
> > - add ->supports_local_mode() callback to transports (Stefano)
> > ---
> > drivers/vhost/vsock.c | 6 ++++++
> > include/net/af_vsock.h | 11 +++++++++++
> > net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++
> > net/vmw_vsock/hyperv_transport.c | 6 ++++++
> > net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
> > net/vmw_vsock/vmci_transport.c | 12 ++++++++++++
> > net/vmw_vsock/vsock_loopback.c | 6 ++++++
> > 7 files changed, 86 insertions(+)
> >
> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > index 69074656263d..4e3856aa2479 100644
> > --- a/drivers/vhost/vsock.c
> > +++ b/drivers/vhost/vsock.c
> > @@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void)
> > return VHOST_VSOCK_DEFAULT_HOST_CID;
> > }
> >
> > +static bool vhost_transport_supports_local_mode(void)
> > +{
> > + return false;
> > +}
> > +
> > /* Callers that dereference the return value must hold vhost_vsock_mutex or the
> > * RCU read lock.
> > */
> > @@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = {
> > .module = THIS_MODULE,
> >
> > .get_local_cid = vhost_transport_get_local_cid,
> > + .supports_local_mode = vhost_transport_supports_local_mode,
> >
> > .init = virtio_transport_do_socket_init,
> > .destruct = virtio_transport_destruct,
> > diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
> > index 59d97a143204..e24ef1d9fe02 100644
> > --- a/include/net/af_vsock.h
> > +++ b/include/net/af_vsock.h
> > @@ -180,6 +180,17 @@ struct vsock_transport {
> > /* Addressing. */
> > u32 (*get_local_cid)(void);
> >
> > + /* Return true if the transport is compatible with
> > + * VSOCK_NET_MODE_LOCAL. Otherwise, return false.
> > + *
> > + * Transports should return false if they lack local-mode namespace
> > + * support (e.g., G2H transports like hyperv-vsock and vmci-vsock).
> > + * virtio-vsock returns true only if no device is present in order to
> > + * enable local mode in nested scenarios in which virtio-vsock is
> > + * loaded or built-in, but nonetheless unusable by sockets.
> > + */
> > + bool (*supports_local_mode)(void);
> > +
> > /* Read a single skb */
> > int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
> >
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index 243c0d588682..120adb9dad9f 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -91,6 +91,12 @@
> > * and locked down by a namespace manager. The default is "global". The mode
> > * is set per-namespace.
> > *
> > + * Note: LOCAL mode is only supported when using namespace-aware transports
> > + * (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
> > + * hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
> > + * fail with EOPNOTSUPP, as these transports do not support per-namespace
> > + * isolation.
> > + *
> > * The modes affect the allocation and accessibility of CIDs as follows:
> > *
> > * - global - access and allocation are all system-wide
> > @@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
> > else
> > return -EINVAL;
> >
> > + mutex_lock(&vsock_register_mutex);
> > + if (mode == VSOCK_NET_MODE_LOCAL &&
> > + transport_g2h && transport_g2h->supports_local_mode &&
> > + !transport_g2h->supports_local_mode()) {
> > + mutex_unlock(&vsock_register_mutex);
> > + return -EOPNOTSUPP;
> > + }
> > + mutex_unlock(&vsock_register_mutex);
>
> Wait, I think we already discussed this: vsock_net_write_mode() must be
> called with the lock held.
>
> See
> https://lore.kernel.org/netdev/aRTTwuuXSz5CvNjt@devvm11784.nha0.facebook.com/
>
Ah right, oversight on my part.
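
To make the locking point concrete, here is a minimal userspace sketch in which the capability check and the mode write share one critical section. Everything in it is a model with hypothetical names: a pthread mutex stands in for vsock_register_mutex, `write_mode_locked()` stands in for vsock_net_write_mode(), and its write-once behavior is an assumption inferred from the -EPERM return in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum vsock_net_mode { VSOCK_NET_MODE_GLOBAL, VSOCK_NET_MODE_LOCAL };

/* Hypothetical stand-ins; not the kernel structures. */
struct transport_model {
	bool (*supports_local_mode)(void);
};

struct net_model {
	enum vsock_net_mode mode;
	bool mode_locked;	/* assumption: the mode can be written once */
};

static pthread_mutex_t register_mutex = PTHREAD_MUTEX_INITIALIZER;
static const struct transport_model *transport_g2h_model;

static bool no_local_mode(void) { return false; }
static const struct transport_model g2h_without_local = {
	.supports_local_mode = no_local_mode,
};

/* Caller must hold register_mutex. */
static bool write_mode_locked(struct net_model *net, enum vsock_net_mode mode)
{
	if (net->mode_locked)
		return false;
	net->mode = mode;
	net->mode_locked = true;
	return true;
}

static int set_net_mode(struct net_model *net, enum vsock_net_mode mode)
{
	int ret = 0;

	pthread_mutex_lock(&register_mutex);

	if (mode == VSOCK_NET_MODE_LOCAL && transport_g2h_model &&
	    transport_g2h_model->supports_local_mode &&
	    !transport_g2h_model->supports_local_mode()) {
		ret = -EOPNOTSUPP;
		goto out;
	}

	/* Still under the lock: no transport can register in between. */
	if (!write_mode_locked(net, mode))
		ret = -EPERM;
out:
	pthread_mutex_unlock(&register_mutex);
	return ret;
}
```

The single unlock at `out:` is the design point: dropping the mutex between the check and the write would let a registration slip in after the check has passed.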
> Since I guess we need another version of this patch, can you check the
> commit description to see if it reflects what we are doing now
> (e.g., vhost is not enabled)?
>
> Also, I don't understand why we will enable it later for vhost but are
> enabling it now for virtio_transport and vsock_loopback, even though this
> patch comes before the support for those transports. I'm a bit confused.
>
> If something is unclear, let's discuss it before sending a new version.
>
>
> What I had in mind was to add this patch and explain why we need the new
> callback (like you did), but to enable the support in the patches that
> actually enable it for each transport. But maybe what is not clear to me
> is that we need this only for G2H. But now I'm confused about the
> discussion around vmci H2G. We decided to discard that one too, but here
> we are not checking that?
> I mean, IIUC we are calling supports_local_mode() only on G2H here.

Ah right, VMCI broke my original mental model of only needing this check
for G2H (originally I didn't realize VMCI was H2G too).

I think we now actually need to do this check for all of the transports,
no? Including h2g, g2h, local, and dgram?

Additionally, the commit description needs to be updated to reflect that.

With this, we then end up with two commits:

commit 1) This commit, which adds the callbacks and gives each
          transport a stub that returns false. It checks all transports
          (not just G2H), updates the commit description, and fixes the
          vsock_net_write_mode() race above.

commit 2) Change virtio-vsock/vhost-vsock/vsock-loopback to add the
          real implementations (vhost + loopback return true, virtio
          detects device). The other transports keep their return-false
          stubs, so no changes there.

Does that seem about right?
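
As a rough sketch of what "check all transports" in commit 1 could look like, the userspace model below walks every transport slot rather than only G2H. The slot enum, `slots[]` array and helper names are hypothetical; the kernel keeps separate pointers for the h2g, g2h, dgram and local transports. An empty slot or a missing callback is treated as "no objection", mirroring the NULL checks in the posted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vsock_transport. */
struct transport_model {
	bool (*supports_local_mode)(void);
};

/* One slot per transport role mentioned in the thread. */
enum { SLOT_H2G, SLOT_G2H, SLOT_DGRAM, SLOT_LOCAL, SLOT_COUNT };

static const struct transport_model *slots[SLOT_COUNT];

static bool cb_true(void) { return true; }
static bool cb_false(void) { return false; }

static const struct transport_model model_true = { cb_true };
static const struct transport_model model_false = { cb_false };

static bool transport_allows_local(const struct transport_model *t)
{
	/* Unregistered slot or missing callback: nothing objects. */
	if (!t || !t->supports_local_mode)
		return true;
	return t->supports_local_mode();
}

/* True only if no registered transport objects to local mode. */
static bool all_transports_allow_local(void)
{
	for (size_t i = 0; i < SLOT_COUNT; i++) {
		if (!transport_allows_local(slots[i]))
			return false;
	}
	return true;
}
```

With return-false stubs everywhere after commit 1, any registered transport would block local mode; commit 2 would then flip individual callbacks as each transport gains real support.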
>
> > +
> > if (!vsock_net_write_mode(net, mode))
> > return -EPERM;
> >
> > @@ -2938,6 +2953,7 @@ int vsock_core_register(const struct vsock_transport *t, int features)
> > {
> > const struct vsock_transport *t_h2g, *t_g2h, *t_dgram, *t_local;
> > int err = mutex_lock_interruptible(&vsock_register_mutex);
> > + struct net *net;
> >
> > if (err)
> > return err;
> > @@ -2960,6 +2976,22 @@ int vsock_core_register(const struct vsock_transport *t, int features)
> > err = -EBUSY;
> > goto err_busy;
> > }
> > +
> > + /* G2H sockets break in LOCAL mode namespaces because G2H
>
> Here too we are talking only about G2H, so what happens if vmci is
> loaded as H2G?
>
> IMO we should discuss this a bit more and make it more generic, e.g.,
> check all the transports.
>
> Thanks,
> Stefano
Agreed.
Best,
Bobby
On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote:
>On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
>> On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
>> > From: Bobby Eshleman <bobbyeshleman@meta.com>
>> >
>> > Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
>> > is operational. Additionally, reject G2H transport registration if there
>> > already exists a namespace in local mode.
>> >
>> > G2H sockets break in local mode because the G2H transports don't support
>> > namespacing yet. The current approach is to coerce packets coming out of
>> > G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
>> > to coerce sockets in the same way because it cannot be deduced which
>> > transport will be used by the socket. Specifically, when bound to
>> > VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
>> > until a packet is received and matched to the bound socket that we
>> > assign the transport. This presents a chicken-and-egg problem, because
>> > we need the namespace to lookup the socket and resolve the transport,
>> > but we need the transport to know how to use the namespace during
>> > lookup.
>> >
>> > For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
>> > used on systems that support G2H, even nested systems that also have H2G
>> > transports.
>> >
>> > Local mode is blocked based on detecting the presence of G2H devices
>> > (when possible, as hyperv is special). This means that a host kernel
>> > with G2H support compiled in (or has the module loaded), will still
>> > support local mode if there is no G2H (e.g., virtio-vsock) device
>> > detected. This enables using the same kernel in the host and in the
>> > guest, as we do in kselftest.
>> >
>> > Systems with only namespace-aware transports (vhost-vsock, loopback) can
>> > still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
>> > intended.
>> >
>> > Add supports_local_mode() transport callback to indicate
>> > transport-specific local mode support.
>> >
>> > These restrictions can be lifted in a future patch series when G2H
>> > transports gain namespace support.
>> >
>> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
>> > ---
>> > Changes in v11:
>> > - vhost_transport_supports_local_mode() returns false to keep things
>> > disabled until support comes online (Stefano)
>> > - add comment above supports_local_mode() cb to clarify (Stefano)
>> > - Remove redundant `ret = 0` initialization in vsock_net_mode_string()
>> > (Stefano)
>> > - Refactor vsock_net_mode_string() to separate parsing from validation
>> > (Stefano)
>> > - vmci returns false for supports_local_mode(), with comment
>> >
>> > Changes in v10:
>> > - move this patch before any transports bring online namespacing (Stefano)
>> > - move vsock_net_mode_string into critical section (Stefano)
>> > - add ->supports_local_mode() callback to transports (Stefano)
>> > ---
>> > drivers/vhost/vsock.c | 6 ++++++
>> > include/net/af_vsock.h | 11 +++++++++++
>> > net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++
>> > net/vmw_vsock/hyperv_transport.c | 6 ++++++
>> > net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
>> > net/vmw_vsock/vmci_transport.c | 12 ++++++++++++
>> > net/vmw_vsock/vsock_loopback.c | 6 ++++++
>> > 7 files changed, 86 insertions(+)
>> >
>> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> > index 69074656263d..4e3856aa2479 100644
>> > --- a/drivers/vhost/vsock.c
>> > +++ b/drivers/vhost/vsock.c
>> > @@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void)
>> > return VHOST_VSOCK_DEFAULT_HOST_CID;
>> > }
>> >
>> > +static bool vhost_transport_supports_local_mode(void)
>> > +{
>> > + return false;
>> > +}
>> > +
>> > /* Callers that dereference the return value must hold vhost_vsock_mutex or the
>> > * RCU read lock.
>> > */
>> > @@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = {
>> > .module = THIS_MODULE,
>> >
>> > .get_local_cid = vhost_transport_get_local_cid,
>> > + .supports_local_mode = vhost_transport_supports_local_mode,
>> >
>> > .init = virtio_transport_do_socket_init,
>> > .destruct = virtio_transport_destruct,
>> > diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>> > index 59d97a143204..e24ef1d9fe02 100644
>> > --- a/include/net/af_vsock.h
>> > +++ b/include/net/af_vsock.h
>> > @@ -180,6 +180,17 @@ struct vsock_transport {
>> > /* Addressing. */
>> > u32 (*get_local_cid)(void);
>> >
>> > + /* Return true if the transport is compatible with
>> > + * VSOCK_NET_MODE_LOCAL. Otherwise, return false.
>> > + *
>> > + * Transports should return false if they lack local-mode namespace
>> > + * support (e.g., G2H transports like hyperv-vsock and vmci-vsock).
>> > + * virtio-vsock returns true only if no device is present in order to
>> > + * enable local mode in nested scenarios in which virtio-vsock is
>> > + * loaded or built-in, but nonetheless unusable by sockets.
>> > + */
>> > + bool (*supports_local_mode)(void);
>> > +
>> > /* Read a single skb */
>> > int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
>> >
>> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> > index 243c0d588682..120adb9dad9f 100644
>> > --- a/net/vmw_vsock/af_vsock.c
>> > +++ b/net/vmw_vsock/af_vsock.c
>> > @@ -91,6 +91,12 @@
>> > * and locked down by a namespace manager. The default is "global". The mode
>> > * is set per-namespace.
>> > *
>> > + * Note: LOCAL mode is only supported when using namespace-aware transports
>> > + * (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
>> > + * hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
>> > + * fail with EOPNOTSUPP, as these transports do not support per-namespace
>> > + * isolation.
>> > + *
>> > * The modes affect the allocation and accessibility of CIDs as follows:
>> > *
>> > * - global - access and allocation are all system-wide
>> > @@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
>> > else
>> > return -EINVAL;
>> >
>> > + mutex_lock(&vsock_register_mutex);
>> > + if (mode == VSOCK_NET_MODE_LOCAL &&
>> > + transport_g2h && transport_g2h->supports_local_mode &&
>> > + !transport_g2h->supports_local_mode()) {
>> > + mutex_unlock(&vsock_register_mutex);
>> > + return -EOPNOTSUPP;
>> > + }
>> > + mutex_unlock(&vsock_register_mutex);
>>
>> Wait, I think we already discussed about this, vsock_net_write_mode() must
>> be called with the lock held.
>>
>> See
>> https://lore.kernel.org/netdev/aRTTwuuXSz5CvNjt@devvm11784.nha0.facebook.com/
>>
>
>Ah right, oversight on my part.
>
>> Since I guess we need another version of this patch, can you check the
>> commit description to see if it reflects what we are doing now
>> (e.g vhost is not enabled)?
>>
>> Also I don't understand why for vhost we will enable it later, but for
>> virtio_transport and vsock_loopback we are enabling it now, also if this
>> patch is before the support on that transports. I'm a bit confused.
>>
>> If something is unclear, let's discuss it before sending a new version.
>>
>>
>> What I had in mind was, add this patch and explain why we need this new
>> callback (like you did), but enable the support in the patches that
>> really enable it for any transport. But maybe what is not clear to me is
>> that we need this only for G2H. But now I'm confused about the discussion
>> around vmci H2G. We decided to discard also that one, but here we are not
>> checking that?
>> I mean here we are calling supports_local_mode() only on G2H IIUC.
>
>Ah right, VMCI broke my original mental model of only needing this check
>for G2H (originally I didn't realize VMCI was H2G too).
>
>I think now, we actually need to do this check for all of the transports
>no? Including h2g, g2h, local, and dgram?
>
>Additionally, the commit description needs to be updated to reflect that.
Let's take a step back, though, because I tried to understand the
problem better and I'm confused.

For example, in vmci (G2H side), when a packet arrives, we always use
vsock_find_connected_socket(), which only searches in GLOBAL. So
connections originating from the host can only reach global sockets in
the guest. In this direction (host -> guest), we should be fine, right?

Now let's consider the other direction, from guest to host, so the
connection should be generated via vsock_connect().
Here I see that we are not doing anything with regard to the source
namespace. At this point, my question is whether we should modify
vsock_assign_transport() or transport->stream_allow() to do this for
each stream, and not prevent loading a G2H module a priori.

For example, stream_allow() could check that the socket namespace is
supported by the assigned transport. E.g., vmci can check that if the
namespace mode is not GLOBAL, then it returns false. (Same thing in
virtio-vsock, I mean the G2H driver).

This should solve the guest -> host direction, but at this point I
wonder if I'm missing something.
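
To make the stream_allow() idea concrete, here is a minimal user-space
sketch, not kernel code. struct model_sock, g2h_stream_allow() and the
extra socket argument are assumptions made for illustration only; the
thread does not show how the real callback would learn the namespace
mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the namespace modes discussed in this thread. */
enum vsock_net_mode {
	VSOCK_NET_MODE_GLOBAL,
	VSOCK_NET_MODE_LOCAL,
};

/* Hypothetical socket model: only the mode of the owning namespace. */
struct model_sock {
	enum vsock_net_mode ns_mode;
};

/*
 * G2H-style check (e.g. vmci or the virtio-vsock guest driver): the
 * transport has no namespace support, so it refuses any stream whose
 * socket does not live in a GLOBAL-mode namespace.
 */
static bool g2h_stream_allow(const struct model_sock *sk,
			     uint32_t cid, uint32_t port)
{
	(void)cid;	/* unused in this sketch */
	(void)port;	/* unused in this sketch */
	return sk->ns_mode == VSOCK_NET_MODE_GLOBAL;
}
```

With a check of this shape, a connect() from a socket in a local-mode
namespace would be refused per stream, while sockets in global-mode
namespaces keep working and the G2H module can still be loaded.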
>
>With this, we then end up with two commits:
>
> commit 1) This commit which adds the callbacks and gives each
> transport stubs to return false. Checks all transports (not just
> G2H). Update the commit. Fix vsock_net_write_mode() race above.
>
> commit 2) change the virtio-vsock/vhost-vsock/vsock-loopback to
> add the real implementations (vhost + loopback return true,
> virtio detects device). The other transports keep their return
> false stubs so no changes.
>
>Does that seem about right?
If we really need this approach, this should be fine.
Thanks,
Stefano
On Mon, Nov 24, 2025 at 11:10:19AM +0100, Stefano Garzarella wrote:
> On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote:
> > On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
> > > On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
> > > > From: Bobby Eshleman <bobbyeshleman@meta.com>
> > > >
> > > > Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
> > > > is operational. Additionally, reject G2H transport registration if there
> > > > already exists a namespace in local mode.
> > > >
> > > > G2H sockets break in local mode because the G2H transports don't support
> > > > namespacing yet. The current approach is to coerce packets coming out of
> > > > G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
> > > > to coerce sockets in the same way because it cannot be deduced which
> > > > transport will be used by the socket. Specifically, when bound to
> > > > VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
> > > > until a packet is received and matched to the bound socket that we
> > > > assign the transport. This presents a chicken-and-egg problem, because
> > > > we need the namespace to lookup the socket and resolve the transport,
> > > > but we need the transport to know how to use the namespace during
> > > > lookup.
> > > >
> > > > For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
> > > > used on systems that support G2H, even nested systems that also have H2G
> > > > transports.
> > > >
> > > > Local mode is blocked based on detecting the presence of G2H devices
> > > > (when possible, as hyperv is special). This means that a host kernel
> > > > with G2H support compiled in (or has the module loaded), will still
> > > > support local mode if there is no G2H (e.g., virtio-vsock) device
> > > > detected. This enables using the same kernel in the host and in the
> > > > guest, as we do in kselftest.
> > > >
> > > > Systems with only namespace-aware transports (vhost-vsock, loopback) can
> > > > still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
> > > > intended.
> > > >
> > > > Add supports_local_mode() transport callback to indicate
> > > > transport-specific local mode support.
> > > >
> > > > These restrictions can be lifted in a future patch series when G2H
> > > > transports gain namespace support.
> > > >
> > > > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > > > ---
> > > > Changes in v11:
> > > > - vhost_transport_supports_local_mode() returns false to keep things
> > > > disabled until support comes online (Stefano)
> > > > - add comment above supports_local_mode() cb to clarify (Stefano)
> > > > - Remove redundant `ret = 0` initialization in vsock_net_mode_string()
> > > > (Stefano)
> > > > - Refactor vsock_net_mode_string() to separate parsing from validation
> > > > (Stefano)
> > > > - vmci returns false for supports_local_mode(), with comment
> > > >
> > > > Changes in v10:
> > > > - move this patch before any transports bring online namespacing (Stefano)
> > > > - move vsock_net_mode_string into critical section (Stefano)
> > > > - add ->supports_local_mode() callback to transports (Stefano)
> > > > ---
> > > > drivers/vhost/vsock.c | 6 ++++++
> > > > include/net/af_vsock.h | 11 +++++++++++
> > > > net/vmw_vsock/af_vsock.c | 32 ++++++++++++++++++++++++++++++++
> > > > net/vmw_vsock/hyperv_transport.c | 6 ++++++
> > > > net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
> > > > net/vmw_vsock/vmci_transport.c | 12 ++++++++++++
> > > > net/vmw_vsock/vsock_loopback.c | 6 ++++++
> > > > 7 files changed, 86 insertions(+)
> > > >
> > > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > > > index 69074656263d..4e3856aa2479 100644
> > > > --- a/drivers/vhost/vsock.c
> > > > +++ b/drivers/vhost/vsock.c
> > > > @@ -64,6 +64,11 @@ static u32 vhost_transport_get_local_cid(void)
> > > > return VHOST_VSOCK_DEFAULT_HOST_CID;
> > > > }
> > > >
> > > > +static bool vhost_transport_supports_local_mode(void)
> > > > +{
> > > > + return false;
> > > > +}
> > > > +
> > > > /* Callers that dereference the return value must hold vhost_vsock_mutex or the
> > > > * RCU read lock.
> > > > */
> > > > @@ -412,6 +417,7 @@ static struct virtio_transport vhost_transport = {
> > > > .module = THIS_MODULE,
> > > >
> > > > .get_local_cid = vhost_transport_get_local_cid,
> > > > + .supports_local_mode = vhost_transport_supports_local_mode,
> > > >
> > > > .init = virtio_transport_do_socket_init,
> > > > .destruct = virtio_transport_destruct,
> > > > diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
> > > > index 59d97a143204..e24ef1d9fe02 100644
> > > > --- a/include/net/af_vsock.h
> > > > +++ b/include/net/af_vsock.h
> > > > @@ -180,6 +180,17 @@ struct vsock_transport {
> > > > /* Addressing. */
> > > > u32 (*get_local_cid)(void);
> > > >
> > > > + /* Return true if the transport is compatible with
> > > > + * VSOCK_NET_MODE_LOCAL. Otherwise, return false.
> > > > + *
> > > > + * Transports should return false if they lack local-mode namespace
> > > > + * support (e.g., G2H transports like hyperv-vsock and vmci-vsock).
> > > > + * virtio-vsock returns true only if no device is present in order to
> > > > + * enable local mode in nested scenarios in which virtio-vsock is
> > > > + * loaded or built-in, but nonetheless unusable by sockets.
> > > > + */
> > > > + bool (*supports_local_mode)(void);
> > > > +
> > > > /* Read a single skb */
> > > > int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
> > > >
> > > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > > > index 243c0d588682..120adb9dad9f 100644
> > > > --- a/net/vmw_vsock/af_vsock.c
> > > > +++ b/net/vmw_vsock/af_vsock.c
> > > > @@ -91,6 +91,12 @@
> > > > * and locked down by a namespace manager. The default is "global". The mode
> > > > * is set per-namespace.
> > > > *
> > > > + * Note: LOCAL mode is only supported when using namespace-aware transports
> > > > + * (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
> > > > + * hyperv-vsock, vmci-vsock) is operational, attempts to set LOCAL mode will
> > > > + * fail with EOPNOTSUPP, as these transports do not support per-namespace
> > > > + * isolation.
> > > > + *
> > > > * The modes affect the allocation and accessibility of CIDs as follows:
> > > > *
> > > > * - global - access and allocation are all system-wide
> > > > @@ -2794,6 +2800,15 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
> > > > else
> > > > return -EINVAL;
> > > >
> > > > + mutex_lock(&vsock_register_mutex);
> > > > + if (mode == VSOCK_NET_MODE_LOCAL &&
> > > > + transport_g2h && transport_g2h->supports_local_mode &&
> > > > + !transport_g2h->supports_local_mode()) {
> > > > + mutex_unlock(&vsock_register_mutex);
> > > > + return -EOPNOTSUPP;
> > > > + }
> > > > + mutex_unlock(&vsock_register_mutex);
> > >
> > > Wait, I think we already discussed about this, vsock_net_write_mode() must
> > > be called with the lock held.
> > >
> > > See
> > > https://lore.kernel.org/netdev/aRTTwuuXSz5CvNjt@devvm11784.nha0.facebook.com/
> > >
> >
> > Ah right, oversight on my part.
> >
> > > Since I guess we need another version of this patch, can you check the
> > > commit description to see if it reflects what we are doing now
> > > (e.g vhost is not enabled)?
> > >
> > > Also I don't understand why for vhost we will enable it later, but for
> > > virtio_transport and vsock_loopback we are enabling it now, also if this
> > > patch is before the support on that transports. I'm a bit confused.
> > >
> > > If something is unclear, let's discuss it before sending a new version.
> > >
> > >
> > > What I had in mind was, add this patch and explain why we need this new
> > > callback (like you did), but enable the support in the patches that
> > > really enable it for any transport. But maybe what is not clear to me is
> > > that we need this only for G2H. But now I'm confused about the discussion
> > > around vmci H2G. We decided to discard also that one, but here we are not
> > > checking that?
> > > I mean here we are calling supports_local_mode() only on G2H IIUC.
> >
> > Ah right, VMCI broke my original mental model of only needing this check
> > for G2H (originally I didn't realize VMCI was H2G too).
> >
> > I think now, we actually need to do this check for all of the transports
> > no? Including h2g, g2h, local, and dgram?
> >
> > Additionally, the commit description needs to be updated to reflect that.
>
> Let's take a step back, though, because I tried to understand the problem
> better and I'm confused.
>
> For example, in vmci (G2H side), when a packet arrives, we always use
> vsock_find_connected_socket(), which only searches in GLOBAL. So connections
> originating from the host can only reach global sockets in the guest. In
> this direction (host -> guest), we should be fine, right?
>
> Now let's consider the other direction, from guest to host, so the
> connection should be generated via vsock_connect().
> Here I see that we are not doing anything with regard to the source
> namespace. At this point, my question is whether we should modify
> vsock_assign_transport() or transport->stream_allow() to do this for each
> stream, and not prevent loading a G2H module a priori.
>
> For example, stream_allow() could check that the socket namespace is
> supported by the assigned transport. E.g., vmci can check that if the
> namespace mode is not GLOBAL, then it returns false. (Same thing in
> virtio-vsock, I mean the G2H driver).
>
> This should solve the guest -> host direction, but at this point I wonder if
> I'm missing something.
For the G2H connect case that is true, but the situation gets a little
fuzzier on the G2H RX side w/ VMADDR_CID_ANY listeners.

Let's say we have a nested system w/ both virtio-vsock and vhost-vsock.
We have a listener in a local-mode namespace on VMADDR_CID_ANY. So far,
no transport is assigned, so we can't call t->stream_allow() yet.
virtio-vsock only knows of global mode, so its lookup will fail (unless
we hack in some special case to virtio_transport_recv_pkt() to scan
local namespaces). vhost-vsock will work as expected. Letting local mode
sockets be silently unreachable by virtio-vsock seems potentially
confusing for users. If the system were not nested, we could pre-resolve
VMADDR_CID_ANY in bind() and handle things upfront as well. Rejecting
local mode outright is just a broad guardrail.
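
The nested scenario above can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: struct
model_listener, lookup_global_only() and lookup_in_ns() are hypothetical
names, and the matching rule for the namespace-aware path (same
namespace id) is a simplification, not the kernel's lookup code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum vsock_net_mode {
	VSOCK_NET_MODE_GLOBAL,
	VSOCK_NET_MODE_LOCAL,
};

/* Hypothetical model of a listener bound to VMADDR_CID_ANY. */
struct model_listener {
	uint32_t port;
	int ns_id;			/* namespace the socket lives in */
	enum vsock_net_mode ns_mode;	/* mode of that namespace */
};

/*
 * Namespace-unaware RX path (virtio-vsock in this scenario): every
 * packet is treated as GLOBAL, so only GLOBAL-mode listeners can match.
 */
static const struct model_listener *
lookup_global_only(const struct model_listener *tbl, size_t n, uint32_t port)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].port == port &&
		    tbl[i].ns_mode == VSOCK_NET_MODE_GLOBAL)
			return &tbl[i];
	return NULL;
}

/*
 * Namespace-aware RX path (vhost-vsock in this scenario): the packet is
 * assumed to carry the namespace of the device it came from, so a
 * listener in that same namespace matches.
 */
static const struct model_listener *
lookup_in_ns(const struct model_listener *tbl, size_t n, uint32_t port,
	     int src_ns_id)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].port == port && tbl[i].ns_id == src_ns_id)
			return &tbl[i];
	return NULL;
}
```

For a single listener in a local-mode namespace, the first lookup finds
nothing while the second finds the listener when the source namespace
matches, which is the "silently unreachable" asymmetry described above.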
If we're trying to find a less heavy-handed option, we might be able to
do the following:

- change bind(cid) w/ cid != VMADDR_CID_ANY to directly assign the
  transport for all socket types (not just SOCK_DGRAM)

- vsock_assign_transport() can outright fail if !t->supports_local_mode()
  and sock_net(sk) has mode local

- bind(VMADDR_CID_ANY) can maybe print (once) to dmesg a warning that
  only the H2G transport will land on VMADDR_CID_ANY sockets.
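
The second item could look roughly like the user-space sketch below.
struct model_transport and model_assign_transport() are hypothetical
stand-ins, and treating a missing callback as "no local mode support" is
a choice of this sketch (the posted patch only rejects when the callback
exists and returns false).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum vsock_net_mode {
	VSOCK_NET_MODE_GLOBAL,
	VSOCK_NET_MODE_LOCAL,
};

/* Hypothetical transport model carrying only the callback of interest. */
struct model_transport {
	const char *name;
	bool (*supports_local_mode)(void);
};

static bool local_mode_unsupported(void) { return false; }
static bool local_mode_supported(void) { return true; }

/*
 * Returns 0 if the transport may be assigned to a socket whose namespace
 * has mode ns_mode, or -EOPNOTSUPP if the namespace is in LOCAL mode and
 * the transport does not support it.
 */
static int model_assign_transport(enum vsock_net_mode ns_mode,
				  const struct model_transport *t)
{
	if (ns_mode == VSOCK_NET_MODE_LOCAL &&
	    (!t->supports_local_mode || !t->supports_local_mode()))
		return -EOPNOTSUPP;
	return 0;
}
```

This only covers sockets whose transport is resolved at assignment time;
it does not by itself address listeners bound to VMADDR_CID_ANY.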
I'm certainly open to other suggestions.
Best,
Bobby
On Mon, Nov 24, 2025 at 09:29:05AM -0800, Bobby Eshleman wrote:
>On Mon, Nov 24, 2025 at 11:10:19AM +0100, Stefano Garzarella wrote:
>> On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote:
>> > On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
>> > > On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:

[...]

>> >
>> > > Since I guess we need another version of this patch, can you check the
>> > > commit description to see if it reflects what we are doing now
>> > > (e.g vhost is not enabled)?
>> > >
>> > > Also I don't understand why for vhost we will enable it later, but for
>> > > virtio_transport and vsock_loopback we are enabling it now, also if this
>> > > patch is before the support on that transports. I'm a bit confused.
>> > >
>> > > If something is unclear, let's discuss it before sending a new version.
>> > >
>> > >
>> > > What I had in mind was, add this patch and explain why we need this new
>> > > callback (like you did), but enable the support in the patches that
>> > > really enable it for any transport. But maybe what is not clear to me is
>> > > that we need this only for G2H. But now I'm confused about the discussion
>> > > around vmci H2G. We decided to discard also that one, but here we are not
>> > > checking that?
>> > > I mean here we are calling supports_local_mode() only on G2H IIUC.
>> >
>> > Ah right, VMCI broke my original mental model of only needing this check
>> > for G2H (originally I didn't realize VMCI was H2G too).
>> >
>> > I think now, we actually need to do this check for all of the transports
>> > no? Including h2g, g2h, local, and dgram?
>> >
>> > Additionally, the commit description needs to be updated to reflect that.
>>
>> Let's take a step back, though, because I tried to understand the problem
>> better and I'm confused.
>>
>> For example, in vmci (G2H side), when a packet arrives, we always use
>> vsock_find_connected_socket(), which only searches in GLOBAL. So connections
>> originating from the host can only reach global sockets in the guest. In
>> this direction (host -> guest), we should be fine, right?
>>
>> Now let's consider the other direction, from guest to host, so the
>> connection should be generated via vsock_connect().
>> Here I see that we are not doing anything with regard to the source
>> namespace. At this point, my question is whether we should modify
>> vsock_assign_transport() or transport->stream_allow() to do this for each
>> stream, and not prevent loading a G2H module a priori.
>>
>> For example, stream_allow() could check that the socket namespace is
>> supported by the assigned transport. E.g., vmci can check that if the
>> namespace mode is not GLOBAL, then it returns false. (Same thing in
>> virtio-vsock, I mean the G2H driver).
>>
>> This should solve the guest -> host direction, but at this point I wonder if
>> I'm missing something.
>
>For the G2H connect case that is true, but the situation gets a little
>fuzzier on the G2H RX side w/ VMADDR_CID_ANY listeners.
>
>Let's say we have a nested system w/ both virtio-vsock and vhost-vsock.
>We have a listener in namespace local on VMADDR_CID_ANY. So far, no
>transport is assigned, so we can't call t->stream_allow() yet.
>virtio-vsock only knows of global mode, so its lookup will fail (unless

What is the problem with failing in this case?
I mean, we are documenting that G2H will not be able to reach sockets in
namespaces with "local" mode. The old (and default) behaviour still allows
them, right?

I don't think it conflicts with the definition of “local” either, because
these connections are coming from outside, and the user doesn't expect to be
able to receive them in a “local” namespace, unless there is a way to put
the device in the namespace (as with net). But this method doesn't exist
yet, and by documenting it sufficiently, we can say that it will be
supported in the future, but not for now.

>we hack in some special case to virtio_transport_recv_pkt() to scan
>local namespaces). vhost-vsock will work as expected. Letting local mode
>sockets be silently unreachable by virtio-vsock seems potentially
>confusing for users. If the system were not nested, we can pre-resolve
>VMADDR_CID_ANY in bind() and handle things upfront as well. Rejecting
>local mode outright is just a broad guardrail.

Okay, but in that case, we are not supporting “local” mode either, and we
are also preventing “global” from being used on these when we are in a
nested environment. What is the advantage of this approach?

>
>If we're trying to find a less heavy-handed option, we might be able to
>do the following:
>
>- change bind(cid) w/ cid != VMADDR_CID_ANY to directly assign the transport
>  for all socket types (not just SOCK_DGRAM)

That would be nice, but it wouldn't solve the problem with VMADDR_CID_ANY,
which I guess is the use case in 99% of cases.

>
>- vsock_assign_transport() can outright fail if !t->supports_local_mode()
>  and sock_net(sk) has mode local

But in this case, why not reuse stream_allow()?

>
>- bind(VMADDR_CID_ANY) can maybe print (once) to dmesg a warning that
>  only the H2G transport will land on VMADDR_CID_ANY sockets.

mmm, I'm not sure about that, we should ask the net maintainers, but IMO
documenting that in af_vsock.c and the man pages should be fine, until G2H
supports that.

>
>I'm certainly open to other suggestions.

IMO we should avoid the failure when loading G2H, which is more confusing
than just discarding connections from the host to a "local" namespace. We
should at least try to support the "global" namespace.

Thanks,
Stefano
On Mon, Nov 24, 2025 at 06:54:45PM +0100, Stefano Garzarella wrote:
> On Mon, Nov 24, 2025 at 09:29:05AM -0800, Bobby Eshleman wrote:
> > On Mon, Nov 24, 2025 at 11:10:19AM +0100, Stefano Garzarella wrote:
> > > On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote:
> > > > On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote:
> > > > > On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote:
>
> [...]
>
> > > >
> > > > > Since I guess we need another version of this patch, can you check the
> > > > > commit description to see if it reflects what we are doing now
> > > > > (e.g vhost is not enabled)?
> > > > >
> > > > > Also I don't understand why for vhost we will enable it later, but for
> > > > > virtio_transport and vsock_loopback we are enabling it now, also if this
> > > > > patch is before the support on that transports. I'm a bit confused.
> > > > >
> > > > > If something is unclear, let's discuss it before sending a new version.
> > > > >
> > > > >
> > > > > What I had in mind was, add this patch and explain why we need this new
> > > > > callback (like you did), but enable the support in the patches that
> > > > > really enable it for any transport. But maybe what is not clear to me is
> > > > > that we need this only for G2H. But now I'm confused about the discussion
> > > > > around vmci H2G. We decided to discard also that one, but here we are not
> > > > > checking that?
> > > > > I mean here we are calling supports_local_mode() only on G2H IIUC.
> > > >
> > > > Ah right, VMCI broke my original mental model of only needing this check
> > > > for G2H (originally I didn't realize VMCI was H2G too).
> > > >
> > > > I think now, we actually need to do this check for all of the transports
> > > > no? Including h2g, g2h, local, and dgram?
> > > >
> > > > Additionally, the commit description needs to be updated to reflect that.
> > >
> > > Let's take a step back, though, because I tried to understand the problem
> > > better and I'm confused.
> > >
> > > For example, in vmci (G2H side), when a packet arrives, we always use
> > > vsock_find_connected_socket(), which only searches in GLOBAL. So connections
> > > originating from the host can only reach global sockets in the guest. In
> > > this direction (host -> guest), we should be fine, right?
> > >
> > > Now let's consider the other direction, from guest to host, so the
> > > connection should be generated via vsock_connect().
> > > Here I see that we are not doing anything with regard to the source
> > > namespace. At this point, my question is whether we should modify
> > > vsock_assign_transport() or transport->stream_allow() to do this for each
> > > stream, and not prevent loading a G2H module a priori.
> > >
> > > For example, stream_allow() could check that the socket namespace is
> > > supported by the assigned transport. E.g., vmci can check that if the
> > > namespace mode is not GLOBAL, then it returns false. (Same thing in
> > > virtio-vsock, I mean the G2H driver).
> > >
> > > This should solve the guest -> host direction, but at this point I wonder if
> > > I'm missing something.
> >
> > For the G2H connect case that is true, but the situation gets a little
> > fuzzier on the G2H RX side w/ VMADDR_CID_ANY listeners.
> >
> > Let's say we have a nested system w/ both virtio-vsock and vhost-vsock.
> > We have a listener in namespace local on VMADDR_CID_ANY. So far, no
> > transport is assigned, so we can't call t->stream_allow() yet.
> > virtio-vsock only knows of global mode, so its lookup will fail (unless
>
> What is the problem of failing in this case?
> I mean, we are documenting that G2H will not be able to reach socket in
> namespaces with "local" mode. Old (and default) behaviour is still allowing
> them, right?
>
> I don't think it conflicts with the definition of “local” either, because
> these connections are coming from outside, and the user doesn't expect to be
> able to receive them in a “local” namespace, unless there is a way to put
> the device in the namespace (as with net). But this method doesn't exist
> yet, and by documenting it sufficiently, we can say that it will be
> supported in the future, but not for now.
>
> > we hack in some special case to virtio_transport_recv_pkt() to scan
> > local namespaces). vhost-vsock will work as expected. Letting local mode
> > sockets be silently unreachable by virtio-vsock seems potentially
> > confusing for users. If the system were not nested, we can pre-resolve
> > VMADDR_CID_ANY in bind() and handle things upfront as well. Rejecting
> > local mode outright is just a broad guardrail.
>
> Okay, but in that case, we are not supporting “local” mode too, but we are
> also preventing “global” from being used on these when we are in a nested
> environment. What is the advantage of this approach?
>
> >
> > If we're trying to find a less heavy-handed option, we might be able to
> > do the following:
> >
> > - change bind(cid) w/ cid != VMADDR_CID_ANY to directly assign the transport
> >   for all socket types (not just SOCK_DGRAM)
>
> That would be nice, but it wouldn't solve the problem with VMADDR_CID_ANY,
> which I guess is the use case in 99% of cases.
>
> >
> > - vsock_assign_transport() can outright fail if !t->supports_local_mode()
> >   and sock_net(sk) has mode local
>
> But in this case, why not reusing stream_allow() ?
>
> >
> > - bind(VMADDR_CID_ANY) can maybe print (once) to dmesg a warning that
> >   only the H2G transport will land on VMADDR_CID_ANY sockets.
>
> mmm, I'm not sure about that, we should ask net maintainer, but IMO
> documenting that in af_vsock.c and man pages should be fine, till G2H will
> support that.
>
> >
> > I'm certainly open to other suggestions.
>
> IMO we should avoid the failure when loading G2H, which is more confusing
> than just discard connection from the host to a "local" namespace. We should
> try at least to support the "global" namespace.
>
> Thanks,
> Stefano

I'm 100% fine with that approach. I just wanted to make sure we landed
in the right place for how users may encounter places where there is no
local mode support.

So for next steps, we can drop this patch and add explicit logic in
->stream_allow() to allow local mode for vhost/loopback and reject for
others? Plus, add documentation about what happens for VMADDR_CID_ANY
(will only receive vhost/loopback traffic in local mode)?

Best,
Bobby
On Mon, Nov 24, 2025 at 10:25:56AM -0800, Bobby Eshleman wrote: >On Mon, Nov 24, 2025 at 06:54:45PM +0100, Stefano Garzarella wrote: >> On Mon, Nov 24, 2025 at 09:29:05AM -0800, Bobby Eshleman wrote: >> > On Mon, Nov 24, 2025 at 11:10:19AM +0100, Stefano Garzarella wrote: >> > > On Fri, Nov 21, 2025 at 11:01:53AM -0800, Bobby Eshleman wrote: >> > > > On Fri, Nov 21, 2025 at 03:24:25PM +0100, Stefano Garzarella wrote: >> > > > > On Thu, Nov 20, 2025 at 09:44:35PM -0800, Bobby Eshleman wrote: >> >> [...] >> >> > > > >> > > > > Since I guess we need another version of this patch, can you check the >> > > > > commit description to see if it reflects what we are doing now >> > > > > (e.g vhost is not enabled)? >> > > > > >> > > > > Also I don't understand why for vhost we will enable it later, but for >> > > > > virtio_transport and vsock_loopback we are enabling it now, also if this >> > > > > patch is before the support on that transports. I'm a bit confused. >> > > > > >> > > > > If something is unclear, let's discuss it before sending a new version. >> > > > > >> > > > > >> > > > > What I had in mind was, add this patch and explain why we need this new >> > > > > callback (like you did), but enable the support in the patches that >> > > > > really enable it for any transport. But maybe what is not clear to me is >> > > > > that we need this only for G2H. But now I'm confused about the discussion >> > > > > around vmci H2G. We decided to discard also that one, but here we are not >> > > > > checking that? >> > > > > I mean here we are calling supports_local_mode() only on G2H IIUC. >> > > > >> > > > Ah right, VMCI broke my original mental model of only needing this check >> > > > for G2H (originally I didn't realize VMCI was H2G too). >> > > > >> > > > I think now, we actually need to do this check for all of the transports >> > > > no? Including h2g, g2h, local, and dgram? >> > > > >> > > > Additionally, the commit description needs to be updated to reflect that. 
>> > >
>> > > Let's take a step back, though, because I tried to understand the problem
>> > > better and I'm confused.
>> > >
>> > > For example, in vmci (G2H side), when a packet arrives, we always use
>> > > vsock_find_connected_socket(), which only searches in GLOBAL. So connections
>> > > originating from the host can only reach global sockets in the guest. In
>> > > this direction (host -> guest), we should be fine, right?
>> > >
>> > > Now let's consider the other direction, from guest to host, so the
>> > > connection should be generated via vsock_connect().
>> > > Here I see that we are not doing anything with regard to the source
>> > > namespace. At this point, my question is whether we should modify
>> > > vsock_assign_transport() or transport->stream_allow() to do this for each
>> > > stream, and not prevent loading a G2H module a priori.
>> > >
>> > > For example, stream_allow() could check that the socket namespace is
>> > > supported by the assigned transport. E.g., vmci can check that if the
>> > > namespace mode is not GLOBAL, then it returns false. (Same thing in
>> > > virtio-vsock, I mean the G2H driver).
>> > >
>> > > This should solve the guest -> host direction, but at this point I wonder if
>> > > I'm missing something.
>> >
>> > For the G2H connect case that is true, but the situation gets a little
>> > fuzzier on the G2H RX side w/ VMADDR_CID_ANY listeners.
>> >
>> > Let's say we have a nested system w/ both virtio-vsock and vhost-vsock.
>> > We have a listener in namespace local on VMADDR_CID_ANY. So far, no
>> > transport is assigned, so we can't call t->stream_allow() yet.
>> > virtio-vsock only knows of global mode, so its lookup will fail (unless
>>
>> What is the problem of failing in this case?
>> I mean, we are documenting that G2H will not be able to reach socket in
>> namespaces with "local" mode. Old (and default) behaviour is still allowing
>> them, right?
>>
>> I don't think it conflicts with the definition of “local” either, because
>> these connections are coming from outside, and the user doesn't expect to be
>> able to receive them in a “local” namespace, unless there is a way to put
>> the device in the namespace (as with net). But this method doesn't exist
>> yet, and by documenting it sufficiently, we can say that it will be
>> supported in the future, but not for now.
>>
>> > we hack in some special case to virtio_transport_recv_pkt() to scan
>> > local namespaces). vhost-vsock will work as expected. Letting local mode
>> > sockets be silently unreachable by virtio-vsock seems potentially
>> > confusing for users. If the system were not nested, we can pre-resolve
>> > VMADDR_CID_ANY in bind() and handle things upfront as well. Rejecting
>> > local mode outright is just a broad guardrail.
>>
>> Okay, but in that case, we are not supporting “local” mode too, but we are
>> also preventing “global” from being used on these when we are in a nested
>> environment. What is the advantage of this approach?
>>
>> >
>> > If we're trying to find a less heavy-handed option, we might be able to
>> > do the following:
>> >
>> > - change bind(cid) w/ cid != VMADDR_CID_ANY to directly assign the
>> >   transport
>> >   for all socket types (not just SOCK_DGRAM)
>>
>> That would be nice, but it wouldn't solve the problem with VMADDR_CID_ANY,
>> which I guess is the use case in 99% of cases.
>>
>> >
>> > - vsock_assign_transport() can outright fail if !t->supports_local_mode()
>> >   and sock_net(sk) has mode local
>>
>> But in this case, why not reusing stream_allow() ?
>>
>> >
>> > - bind(VMADDR_CID_ANY) can maybe print (once) to dmesg a warning that
>> >   only the H2G transport will land on VMADDR_CID_ANY sockets.
>>
>> mmm, I'm not sure about that, we should ask net maintainer, but IMO
>> documenting that in af_vsock.c and man pages should be fine, till G2H will
>> support that.
>>
>> >
>> > I'm certainly open to other suggestions.
>>
>> IMO we should avoid the failure when loading G2H, which is more confusing
>> than just discard connection from the host to a "local" namespace. We should
>> try at least to support the "global" namespace.
>>
>> Thanks,
>> Stefano
>
>
>I'm 100% fine with that approach. I just wanted to make sure we landed
>in the right place for how users may encounter places that there is no
>local mode support.

Yeah, I see, thanks for that!

>
>So for next steps, we can drop this patch and add explicit logic in
>->stream_allow() to allow local mode for vhost/loopback and reject for
>others?

Yep, I would add the logic in the "vsock: add netns to vsock core"
patch, including the changes to stream_allow(), supporting in all
transports only the global mode. In the next patches we can support
`local` mode in related transports (I guess for now just loopback and
vhost-vsock).

> Plus, add documentation about what happens for VMADDR_CID_ANY
>(will only receive vhost/loopback traffic in local mode)?

I'd document that in af_vsock.c when we talk about "local".
I'll make it clear that not all transports support it, and we can
mention that example.

When we will merge this series, we should also send a patch to the
vsock(7) manpage [1] to describe namespace support because I guess that
will be the entry point of the user.

Thanks,
Stefano

[1] https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/man/man7/vsock.7
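
For illustration only, the per-stream check agreed on in this thread
could look roughly like the sketch below. It is a standalone userspace
model, not kernel code: struct sketch_sock, the sketch_* function names,
and the extra socket argument passed to the stream_allow()-style
callbacks are all assumptions made for this example, since the thread
does not spell out the final callback signature. Only the
VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL names come from the patch
description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Namespace modes named in the patch description. */
enum vsock_net_mode {
	VSOCK_NET_MODE_GLOBAL,
	VSOCK_NET_MODE_LOCAL,
};

/* Hypothetical stand-in for a vsock socket: it only records the mode
 * of the network namespace the socket lives in.
 */
struct sketch_sock {
	enum vsock_net_mode net_mode;
};

/* Hypothetical stream_allow() for a transport without namespace
 * support (e.g., a G2H driver): refuse any stream whose socket is
 * not in a global-mode namespace.
 */
bool sketch_g2h_stream_allow(const struct sketch_sock *sk,
			     uint32_t cid, uint32_t port)
{
	(void)cid;
	(void)port;
	return sk->net_mode == VSOCK_NET_MODE_GLOBAL;
}

/* Hypothetical stream_allow() for a namespace-aware transport
 * (e.g., loopback once local mode is enabled for it): both modes
 * are accepted.
 */
bool sketch_loopback_stream_allow(const struct sketch_sock *sk,
				  uint32_t cid, uint32_t port)
{
	(void)sk;
	(void)cid;
	(void)port;
	return true;
}
```

With this shape, a socket in a local-mode namespace that resolves to a
global-only transport is refused per stream, while loading the G2H
module itself is never rejected and global-mode sockets keep working.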