net/ethtool/netlink.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
The ethtool operations may be performed on a net device that is currently
being unregistered. This is also described in the commit dde91ccfa25fd58f
("ethtool: do not perform operations on net devices being unregistered").
Moreover, such a device may have a parent that has already been freed.
Let's check the net device reg_state in ethnl_ops_begin() before calling
pm_runtime_get_sync() for its parent device to avoid use-after-free
like this:
==================================================================
BUG: KASAN: slab-use-after-free in __pm_runtime_resume+0xe2/0xf0
Read of size 2 at addr ffff88810cfc46f8 by task pm/606
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x70
print_report+0x170/0x4f3
? __pfx__raw_spin_lock_irqsave+0x10/0x10
kasan_report+0xda/0x110
? __pm_runtime_resume+0xe2/0xf0
? __pm_runtime_resume+0xe2/0xf0
__pm_runtime_resume+0xe2/0xf0
ethnl_ops_begin+0x49/0x270
ethnl_set_features+0x23c/0xab0
? __pfx_ethnl_set_features+0x10/0x10
? kvm_sched_clock_read+0x11/0x20
? local_clock_noinstr+0xf/0xf0
? local_clock+0x10/0x30
? kasan_save_track+0x25/0x60
? __kasan_kmalloc+0x7f/0x90
? genl_family_rcv_msg_attrs_parse.isra.0+0x150/0x2c0
genl_family_rcv_msg_doit+0x1e7/0x2c0
? __pfx_genl_family_rcv_msg_doit+0x10/0x10
? __pfx_cred_has_capability.isra.0+0x10/0x10
? stack_trace_save+0x8e/0xc0
genl_rcv_msg+0x411/0x660
? __pfx_genl_rcv_msg+0x10/0x10
? __pfx_ethnl_set_features+0x10/0x10
netlink_rcv_skb+0x121/0x380
? __pfx_genl_rcv_msg+0x10/0x10
? __pfx_netlink_rcv_skb+0x10/0x10
? __pfx_down_read+0x10/0x10
genl_rcv+0x23/0x30
netlink_unicast+0x60f/0x830
? __pfx_netlink_unicast+0x10/0x10
? __pfx___alloc_skb+0x10/0x10
netlink_sendmsg+0x6ea/0xbc0
? __pfx_netlink_sendmsg+0x10/0x10
? __futex_queue+0x10b/0x1f0
____sys_sendmsg+0x7a2/0x950
? copy_msghdr_from_user+0x26b/0x430
? __pfx_____sys_sendmsg+0x10/0x10
? __pfx_copy_msghdr_from_user+0x10/0x10
___sys_sendmsg+0xf8/0x180
? __pfx____sys_sendmsg+0x10/0x10
? __pfx_futex_wait+0x10/0x10
? fdget+0x2e4/0x4a0
__sys_sendmsg+0x11f/0x1c0
? __pfx___sys_sendmsg+0x10/0x10
do_syscall_64+0xe2/0x570
? exc_page_fault+0x66/0xb0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Fixes: d43c65b05b848e0b ("ethtool: runtime-resume netdev parent in ethnl_ops_begin")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
net/ethtool/netlink.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 6e5f0f4f815a..1cdedd1af24d 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -94,17 +94,16 @@ int ethnl_ops_begin(struct net_device *dev)
if (!dev)
return -ENODEV;
- if (dev->dev.parent)
- pm_runtime_get_sync(dev->dev.parent);
-
netdev_ops_assert_locked(dev);
if (!netif_device_present(dev) ||
dev->reg_state >= NETREG_UNREGISTERING) {
- ret = -ENODEV;
- goto err;
+ return -ENODEV;
}
+ if (dev->dev.parent)
+ pm_runtime_get_sync(dev->dev.parent);
+
if (dev->ethtool_ops->begin) {
ret = dev->ethtool_ops->begin(dev);
if (ret)
--
2.53.0
On Sun, Mar 22, 2026 at 10:59:15AM +0300, Alexander Popov wrote:
> The ethtool operations may be performed on a net device that is currently
> being unregistered. This is also described in the commit dde91ccfa25fd58f
> ("ethtool: do not perform operations on net devices being unregistered").
> Moreover, such a device may have a parent that has already been freed.
I don't know the device core too well, but can you explain in the
example stack trace you gave how the parent can already be free? I
would of expected devices are arranged in a tree, and free happens
from the leaves towards the root? In order to do a clean shutdown, you
might need to perform operations on the parent.
I just want to make sure this is not a broken parent device which
should be fixed...
Andrew
On 3/22/26 17:39, Andrew Lunn wrote:
> On Sun, Mar 22, 2026 at 10:59:15AM +0300, Alexander Popov wrote:
>> The ethtool operations may be performed on a net device that is currently
>> being unregistered. This is also described in the commit dde91ccfa25fd58f
>> ("ethtool: do not perform operations on net devices being unregistered").
>> Moreover, such a device may have a parent that has already been freed.
>
> I don't know the device core too well, but can you explain in the
> example stack trace you gave how the parent can already be free? I
> would of expected devices are arranged in a tree, and free happens
> from the leaves towards the root? In order to do a clean shutdown, you
> might need to perform operations on the parent.
>
> I just want to make sure this is not a broken parent device which
> should be fixed...
Hello Andrew, let me describe the scenario that I see:
- The netdev_run_todo() function handles the net devices in net_todo_list
in a loop and moves each of them into the NETREG_UNREGISTERED state:
netdev_lock(dev);
WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERED);
netdev_unlock(dev);
- Then netdev_run_todo() frees these net devices in another loop.
On each iteration, it chooses a device for freeing:
dev = netdev_wait_allrefs_any(&list);
- At the same time, the ethnl_set_features() function calls
ethnl_parse_header_dev_get() for the child net device.
- If the race condition succeeds, ethnl_set_features() takes the reference
to the child net device being unregistered. That makes netdev_run_todo()
free the parent first.
Best regards,
Alexander
On Mon, 23 Mar 2026 02:08:53 +0300 Alexander Popov wrote: > Hello Andrew, let me describe the scenario that I see: > > - The netdev_run_todo() function handles the net devices in net_todo_list > in a loop and moves each of them into the NETREG_UNREGISTERED state: > netdev_lock(dev); > WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERED); > netdev_unlock(dev); > > - Then netdev_run_todo() frees these net devices in another loop. > On each iteration, it chooses a device for freeing: > dev = netdev_wait_allrefs_any(&list); > > - At the same time, the ethnl_set_features() function calls > ethnl_parse_header_dev_get() for the child net device. > > - If the race condition succeeds, ethnl_set_features() takes the reference > to the child net device being unregistered. That makes netdev_run_todo() > free the parent first. That's not sufficient detail. ethnl_parse_header_dev_get() is under RCU and unregistration does an RCU sync after delisting the device. Also not sure you're distinguishing struct net_device and struct device. How did you hit this issue? What are the net devices involved?
On 3/24/26 01:08, Jakub Kicinski wrote: > On Mon, 23 Mar 2026 02:08:53 +0300 Alexander Popov wrote: >> Hello Andrew, let me describe the scenario that I see: >> >> - The netdev_run_todo() function handles the net devices in net_todo_list >> in a loop and moves each of them into the NETREG_UNREGISTERED state: >> netdev_lock(dev); >> WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERED); >> netdev_unlock(dev); >> >> - Then netdev_run_todo() frees these net devices in another loop. >> On each iteration, it chooses a device for freeing: >> dev = netdev_wait_allrefs_any(&list); >> >> - At the same time, the ethnl_set_features() function calls >> ethnl_parse_header_dev_get() for the child net device. >> >> - If the race condition succeeds, ethnl_set_features() takes the reference >> to the child net device being unregistered. That makes netdev_run_todo() >> free the parent first. > > That's not sufficient detail. ethnl_parse_header_dev_get() is under RCU > and unregistration does an RCU sync after delisting the device. Also > not sure you're distinguishing struct net_device and struct device. > > How did you hit this issue? What are the net devices involved? I've provided additional details about the reproducer of this vulnerability to Jakub and to security@kernel.org. Best regards, Alexander
On 25 March 2026 03:46:20 GMT+09:00, Alexander Popov <alex.popov@linux.com> wrote: >On 3/24/26 01:08, Jakub Kicinski wrote: >> On Mon, 23 Mar 2026 02:08:53 +0300 Alexander Popov wrote: >>> Hello Andrew, let me describe the scenario that I see: >>> >>> - The netdev_run_todo() function handles the net devices in net_todo_list >>> in a loop and moves each of them into the NETREG_UNREGISTERED state: >>> netdev_lock(dev); >>> WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERED); >>> netdev_unlock(dev); >>> >>> - Then netdev_run_todo() frees these net devices in another loop. >>> On each iteration, it chooses a device for freeing: >>> dev = netdev_wait_allrefs_any(&list); >>> >>> - At the same time, the ethnl_set_features() function calls >>> ethnl_parse_header_dev_get() for the child net device. >>> >>> - If the race condition succeeds, ethnl_set_features() takes the reference >>> to the child net device being unregistered. That makes netdev_run_todo() >>> free the parent first. >> >> That's not sufficient detail. ethnl_parse_header_dev_get() is under RCU >> and unregistration does an RCU sync after delisting the device. Also >> not sure you're distinguishing struct net_device and struct device. >> >> How did you hit this issue? What are the net devices involved? > >I've provided additional details about the reproducer of this vulnerability to Jakub and to security@kernel.org. Hello! May I ask about the decision on this patch? At patchwork.kernel.org, it is marked as "Changes Requested": <https://patchwork.kernel.org/project/netdevbpf/patch/20260322075917.254874-1-alex.popov@linux.com/> However, I don't have any instructions on what to change in it. Thanks! Alexander
On Sun, 29 Mar 2026 17:47:30 +0900 Alexander Popov wrote: > >I've provided additional details about the reproducer of this vulnerability to Jakub and to security@kernel.org. > > Hello! May I ask about the decision on this patch? > > At patchwork.kernel.org, it is marked as "Changes Requested": > <https://patchwork.kernel.org/project/netdevbpf/patch/20260322075917.254874-1-alex.popov@linux.com/> > > However, I don't have any instructions on what to change in it. The change was to fix the wifi driver. Please try not to waste more maintainer time than necessary.
> >> That's not sufficient detail. ethnl_parse_header_dev_get() is under RCU
> >> and unregistration does an RCU sync after delisting the device. Also
> >> not sure you're distinguishing struct net_device and struct device.
> >>
> >> How did you hit this issue? What are the net devices involved?
> >
> >I've provided additional details about the reproducer of this vulnerability to Jakub and to security@kernel.org.
>
> Hello! May I ask about the decision on this patch?
>
> At patchwork.kernel.org, it is marked as "Changes Requested":
> <https://patchwork.kernel.org/project/netdevbpf/patch/20260322075917.254874-1-alex.popov@linux.com/>
>
> However, I don't have any instructions on what to change in it.
You need to add to the commit message to explain how it can happen.
My question said i _think_ devices are arranged in a tree, and release
happens from the leaves towards the root. So a parent should not be
release first. If i'm correct, this crash indicates a problem
somewhere else, not here. So we should fix that. Your commit message
needs to convince me the change is fixing the real problem, not
papering over the cracks.
Andrew
© 2016 - 2026 Red Hat, Inc.