[PATCH net,v2] hv_netvsc: Switch VF namespace in netvsc_open instead

Haiyang Zhang posted 1 patch 2 months, 3 weeks ago
drivers/net/hyperv/netvsc_drv.c | 43 ++++++++++-----------------------
1 file changed, 13 insertions(+), 30 deletions(-)
Posted by Haiyang Zhang 2 months, 3 weeks ago
From: Haiyang Zhang <haiyangz@microsoft.com>

The existing code moves the VF NIC to the new namespace when NETDEV_REGISTER
is received on the netvsc NIC. During deletion of the namespace,
default_device_exit_batch() >> default_device_exit_net() is called. When the
netvsc NIC is moved back and registered to the default namespace, it
automatically brings the VF NIC back to the default namespace as well. This
leaves the for_each_netdev_safe loop in default_device_exit_net() unable to
detect the end of the list, and it hits a NULL pointer dereference:

[  231.449420] mana 7870:00:00.0 enP30832s1: Moved VF to namespace with: eth0
[  231.449656] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  231.450246] #PF: supervisor read access in kernel mode
[  231.450579] #PF: error_code(0x0000) - not-present page
[  231.450916] PGD 17b8a8067 P4D 0 
[  231.451163] Oops: Oops: 0000 [#1] SMP NOPTI
[  231.451450] CPU: 82 UID: 0 PID: 1394 Comm: kworker/u768:1 Not tainted 6.16.0-rc4+ #3 VOLUNTARY 
[  231.452042] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/21/2024
[  231.452692] Workqueue: netns cleanup_net
[  231.452947] RIP: 0010:default_device_exit_batch+0x16c/0x3f0
[  231.453326] Code: c0 0c f5 b3 e8 d5 db fe ff 48 85 c0 74 15 48 c7 c2 f8 fd ca b2 be 10 00 00 00 48 8d 7d c0 e8 7b 77 25 00 49 8b 86 28 01 00 00 <48> 8b 50 10 4c 8b 2a 4c 8d 62 f0 49 83 ed 10 4c 39 e0 0f 84 d6 00
[  231.454294] RSP: 0018:ff75fc7c9bf9fd00 EFLAGS: 00010246
[  231.454610] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 61c8864680b583eb
[  231.455094] RDX: ff1fa9f71462d800 RSI: ff75fc7c9bf9fd38 RDI: 0000000030766564
[  231.455686] RBP: ff75fc7c9bf9fd78 R08: 0000000000000000 R09: 0000000000000000
[  231.456126] R10: 0000000000000001 R11: 0000000000000004 R12: ff1fa9f70088e340
[  231.456621] R13: ff1fa9f70088e340 R14: ffffffffb3f50c20 R15: ff1fa9f7103e6340
[  231.457161] FS:  0000000000000000(0000) GS:ff1faa6783a08000(0000) knlGS:0000000000000000
[  231.457707] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  231.458031] CR2: 0000000000000010 CR3: 0000000179ab2006 CR4: 0000000000b73ef0
[  231.458434] Call Trace:
[  231.458600]  <TASK>
[  231.458777]  ops_undo_list+0x100/0x220
[  231.459015]  cleanup_net+0x1b8/0x300
[  231.459285]  process_one_work+0x184/0x340

To fix it, move the VF namespace switching code from the NETDEV_REGISTER
event handler to netvsc_open().

Cc: stable@vger.kernel.org
Cc: cavery@redhat.com
Fixes: 4c262801ea60 ("hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
v2: verified it's applicable to net, fixed cc list.
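
For illustration only (not part of the patch): a minimal userspace sketch of
the hazard described in the commit message. It assumes a simplified circular
doubly linked list and hypothetical names (struct node, walk_and_move(),
demo_walk()); it is not the kernel's list or netdev code. A "safe" walk
caches the next entry before running the loop body, which protects against
the current entry being removed, but not against the body also moving the
cached next entry onto another list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a list-linked device; not a kernel type. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_move_tail(struct node *n, struct node *head)
{
	list_del(n);
	list_add_tail(n, head);
}

/*
 * Move every entry from src to dst using a "safe" walk. If follower is
 * non-NULL, moving leader also moves follower, modelling a side effect
 * of the loop body. Returns true if the walk ended on src's head, false
 * if it strayed onto dst's head (this guard exists only in the sketch).
 */
static bool walk_and_move(struct node *src, struct node *dst,
			  struct node *leader, struct node *follower)
{
	struct node *cur, *nxt;

	assert(src != dst);
	for (cur = src->next, nxt = cur->next; cur != src;
	     cur = nxt, nxt = cur->next) {
		if (cur == dst)
			return false;	/* now walking the wrong list */
		list_move_tail(cur, dst);
		if (cur == leader && follower)
			list_move_tail(follower, dst);
	}
	return true;
}

/* Build two lists, put "synthetic" and "vf" on the first, then drain it. */
static bool demo_walk(bool with_side_effect)
{
	struct node old_ns, default_ns, synthetic, vf;

	list_init(&old_ns);
	list_init(&default_ns);
	list_add_tail(&synthetic, &old_ns);
	list_add_tail(&vf, &old_ns);

	return walk_and_move(&old_ns, &default_ns, &synthetic,
			     with_side_effect ? &vf : NULL);
}
```

In this sketch demo_walk(false) ends on the source list head as expected,
while demo_walk(true) follows the cached next entry onto the destination
list and never sees the source head again. That is analogous to the missed
list end described above, not a reproduction of the kernel oops.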

---
 drivers/net/hyperv/netvsc_drv.c | 43 ++++++++++-----------------------
 1 file changed, 13 insertions(+), 30 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 42d98e99566e..074ecc346108 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -135,6 +135,19 @@ static int netvsc_open(struct net_device *net)
 	}
 
 	if (vf_netdev) {
+		if (!net_eq(dev_net(net), dev_net(vf_netdev))) {
+			ret = dev_change_net_namespace(vf_netdev, dev_net(net),
+						       "eth%d");
+			if (ret)
+				netdev_err(vf_netdev,
+					   "Cannot move to same ns as %s: %d\n",
+					   net->name, ret);
+			else
+				netdev_info(vf_netdev,
+					    "Moved VF to namespace with: %s\n",
+					    net->name);
+		}
+
 		/* Setting synthetic device up transparently sets
 		 * slave as up. If open fails, then slave will be
 		 * still be offline (and not used).
@@ -2772,31 +2785,6 @@ static struct  hv_driver netvsc_drv = {
 	},
 };
 
-/* Set VF's namespace same as the synthetic NIC */
-static void netvsc_event_set_vf_ns(struct net_device *ndev)
-{
-	struct net_device_context *ndev_ctx = netdev_priv(ndev);
-	struct net_device *vf_netdev;
-	int ret;
-
-	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
-	if (!vf_netdev)
-		return;
-
-	if (!net_eq(dev_net(ndev), dev_net(vf_netdev))) {
-		ret = dev_change_net_namespace(vf_netdev, dev_net(ndev),
-					       "eth%d");
-		if (ret)
-			netdev_err(vf_netdev,
-				   "Cannot move to same namespace as %s: %d\n",
-				   ndev->name, ret);
-		else
-			netdev_info(vf_netdev,
-				    "Moved VF to namespace with: %s\n",
-				    ndev->name);
-	}
-}
-
 /*
  * On Hyper-V, every VF interface is matched with a corresponding
  * synthetic interface. The synthetic interface is presented first
@@ -2809,11 +2797,6 @@ static int netvsc_netdev_event(struct notifier_block *this,
 	struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
 	int ret = 0;
 
-	if (event_dev->netdev_ops == &device_ops && event == NETDEV_REGISTER) {
-		netvsc_event_set_vf_ns(event_dev);
-		return NOTIFY_DONE;
-	}
-
 	ret = check_dev_is_matching_vf(event_dev);
 	if (ret != 0)
 		return NOTIFY_DONE;
-- 
2.34.1
Re: [PATCH net,v2] hv_netvsc: Switch VF namespace in netvsc_open instead
Posted by Simon Horman 2 months, 3 weeks ago
On Mon, Jul 14, 2025 at 09:41:37AM -0700, Haiyang Zhang wrote:
> From: Haiyang Zhang <haiyangz@microsoft.com>
> 
> The existing code moves the VF NIC to the new namespace when NETDEV_REGISTER
> is received on the netvsc NIC. During deletion of the namespace,
> default_device_exit_batch() >> default_device_exit_net() is called. When the
> netvsc NIC is moved back and registered to the default namespace, it
> automatically brings the VF NIC back to the default namespace as well. This
> leaves the for_each_netdev_safe loop in default_device_exit_net() unable to
> detect the end of the list, and it hits a NULL pointer dereference:
> 
> [  231.449420] mana 7870:00:00.0 enP30832s1: Moved VF to namespace with: eth0
> [  231.449656] BUG: kernel NULL pointer dereference, address: 0000000000000010
> [  231.450246] #PF: supervisor read access in kernel mode
> [  231.450579] #PF: error_code(0x0000) - not-present page
> [  231.450916] PGD 17b8a8067 P4D 0 
> [  231.451163] Oops: Oops: 0000 [#1] SMP NOPTI
> [  231.451450] CPU: 82 UID: 0 PID: 1394 Comm: kworker/u768:1 Not tainted 6.16.0-rc4+ #3 VOLUNTARY 
> [  231.452042] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/21/2024
> [  231.452692] Workqueue: netns cleanup_net
> [  231.452947] RIP: 0010:default_device_exit_batch+0x16c/0x3f0
> [  231.453326] Code: c0 0c f5 b3 e8 d5 db fe ff 48 85 c0 74 15 48 c7 c2 f8 fd ca b2 be 10 00 00 00 48 8d 7d c0 e8 7b 77 25 00 49 8b 86 28 01 00 00 <48> 8b 50 10 4c 8b 2a 4c 8d 62 f0 49 83 ed 10 4c 39 e0 0f 84 d6 00
> [  231.454294] RSP: 0018:ff75fc7c9bf9fd00 EFLAGS: 00010246
> [  231.454610] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 61c8864680b583eb
> [  231.455094] RDX: ff1fa9f71462d800 RSI: ff75fc7c9bf9fd38 RDI: 0000000030766564
> [  231.455686] RBP: ff75fc7c9bf9fd78 R08: 0000000000000000 R09: 0000000000000000
> [  231.456126] R10: 0000000000000001 R11: 0000000000000004 R12: ff1fa9f70088e340
> [  231.456621] R13: ff1fa9f70088e340 R14: ffffffffb3f50c20 R15: ff1fa9f7103e6340
> [  231.457161] FS:  0000000000000000(0000) GS:ff1faa6783a08000(0000) knlGS:0000000000000000
> [  231.457707] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  231.458031] CR2: 0000000000000010 CR3: 0000000179ab2006 CR4: 0000000000b73ef0
> [  231.458434] Call Trace:
> [  231.458600]  <TASK>
> [  231.458777]  ops_undo_list+0x100/0x220
> [  231.459015]  cleanup_net+0x1b8/0x300
> [  231.459285]  process_one_work+0x184/0x340
> 
> To fix it, move the VF namespace switching code from the NETDEV_REGISTER
> event handler to netvsc_open().
> 
> Cc: stable@vger.kernel.org
> Cc: cavery@redhat.com
> Fixes: 4c262801ea60 ("hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event")
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>

With this change do we go back to the situation that existed prior
to the cited patch? Quoting the cited commit:

    The existing code moves VF to the same namespace as the synthetic NIC
    during netvsc_register_vf(). But, if the synthetic device is moved to a
    new namespace after the VF registration, the VF won't be moved together.

Or perhaps not: if the synthetic device is moved then, in practice, it
will subsequently be reopened (because it is closed as part of the move
to a different netns), at which point netvsc_open() would move the VF too?

I am unsure.
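
To make the question concrete, here is a toy userspace model of the two
cases. It is only a sketch under stated assumptions: the names (struct
toy_dev, toy_move(), toy_open(), toy_vf_ns_after_move()) are hypothetical,
namespaces are plain integers, and it assumes that moving a device to
another namespace closes it first. It does not settle whether the device is
always reopened in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; not a kernel structure. */
struct toy_dev {
	int ns;			/* namespace id, 0 = default */
	bool up;
	struct toy_dev *vf;	/* paired VF, or NULL */
};

/* Assumption: a namespace move closes the device first. */
static void toy_move(struct toy_dev *dev, int ns)
{
	dev->up = false;
	dev->ns = ns;
}

/* Models the open-time check this patch adds to netvsc_open(). */
static void toy_open(struct toy_dev *dev)
{
	if (dev->vf && dev->vf->ns != dev->ns)
		toy_move(dev->vf, dev->ns);
	dev->up = true;
	if (dev->vf)
		dev->vf->up = true;
}

/* Returns the VF's namespace after the synthetic device moves to ns 1. */
static int toy_vf_ns_after_move(bool reopen)
{
	struct toy_dev vf = { .ns = 0, .up = false, .vf = NULL };
	struct toy_dev synthetic = { .ns = 0, .up = false, .vf = &vf };

	toy_open(&synthetic);
	assert(vf.up && vf.ns == 0);

	toy_move(&synthetic, 1);
	if (reopen)
		toy_open(&synthetic);
	return vf.ns;
}
```

In this model the VF follows the synthetic device only once the synthetic
device is opened again: toy_vf_ns_after_move(true) returns 1, while
toy_vf_ns_after_move(false) leaves the VF in namespace 0, which is the
window the question above is about.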