As a nearly-final step in register_netdevice(), finalize the name in the
refcount tracker, and register a debugfs file for it.
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
net/core/dev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 2f7f5fd9ffec7c0fc219eb6ba57d57a55134186e..377a19240f2b774281088d91e7da3e4b6ec52e33 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10834,8 +10834,9 @@ static void netdev_free_phy_link_topology(struct net_device *dev)
*/
int register_netdevice(struct net_device *dev)
{
- int ret;
struct net *net = dev_net(dev);
+ char name[NAME_MAX + 1];
+ int ret;
BUILD_BUG_ON(sizeof(netdev_features_t) * BITS_PER_BYTE <
NETDEV_FEATURE_COUNT);
@@ -10994,6 +10995,9 @@ int register_netdevice(struct net_device *dev)
dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
rtmsg_ifinfo(RTM_NEWLINK, dev, ~0U, GFP_KERNEL, 0, NULL);
+ /* Register debugfs file for the refcount tracker */
+ if (snprintf(name, sizeof(name), "netdev-%s@%p", dev->name, dev) < sizeof(name))
+ ref_tracker_dir_debugfs(&dev->refcnt_tracker, name);
out:
return ret;
--
2.49.0
On Fri, 18 Apr 2025 10:24:31 -0400 Jeff Layton wrote: > + /* Register debugfs file for the refcount tracker */ > + if (snprintf(name, sizeof(name), "netdev-%s@%p", dev->name, dev) < sizeof(name)) > + ref_tracker_dir_debugfs(&dev->refcnt_tracker, name); Names are not unique and IIUC debugfs is not namespaced. How much naming the objects in a "user readable" fashion actually matter? It'd be less churn to create some kind of "object class" with a directory level named after what's already passed to ref_tracker_dir_init() and then id the objects by the pointer value as sub-dirs of that?
On Wed, 2025-04-23 at 16:53 -0700, Jakub Kicinski wrote: > On Fri, 18 Apr 2025 10:24:31 -0400 Jeff Layton wrote: > > + /* Register debugfs file for the refcount tracker */ > > + if (snprintf(name, sizeof(name), "netdev-%s@%p", dev->name, dev) < sizeof(name)) > > + ref_tracker_dir_debugfs(&dev->refcnt_tracker, name); > > Names are not unique and IIUC debugfs is not namespaced. > How much naming the objects in a "user readable" fashion actually > matter? It'd be less churn to create some kind of "object class" > with a directory level named after what's already passed to > ref_tracker_dir_init() and then id the objects by the pointer value > as sub-dirs of that? That sounds closer to what I had done originally. Andrew L. suggested the flat directory that this version represents. I'm fine with whatever hierarchy, but let's decide that before I respin again. When I was tracking down net namespace leaks recently, it was very nice to have the inode number of the leaked netns's in the filenames. I would have probably had to grovel around with drgn to figure out the address if that weren't embedded in the name. I think we probably ought to leave it up to each subsystem how it names its files. The discriminators between different types of objects can vary wildly. One thing that might be simpler is to make ref_tracker_dir_debugfs() a varargs function and allow passing in a format string and set of arguments for it. That might make things simpler for the callers. -- Jeff Layton <jlayton@kernel.org>
On Wed, 23 Apr 2025 20:04:58 -0400 Jeff Layton wrote: > On Wed, 2025-04-23 at 16:53 -0700, Jakub Kicinski wrote: > > Names are not unique and IIUC debugfs is not namespaced. > > How much naming the objects in a "user readable" fashion actually > > matter? It'd be less churn to create some kind of "object class" > > with a directory level named after what's already passed to > > ref_tracker_dir_init() and then id the objects by the pointer value > > as sub-dirs of that? > > That sounds closer to what I had done originally. Andrew L. suggested > the flat directory that this version represents. I'm fine with whatever > hierarchy, but let's decide that before I respin again. Sorry about that :( > When I was tracking down net namespace leaks recently, it was very nice > to have the inode number of the leaked netns's in the filenames. I > would have probably had to grovel around with drgn to figure out the > address if that weren't embedded in the name. I think we probably ought > to leave it up to each subsystem how it names its files. The > discriminators between different types of objects can vary wildly. > > One thing that might be simpler is to make ref_tracker_dir_debugfs() a > varargs function and allow passing in a format string and set of > arguments for it. That might make things simpler for the callers. Yes, cutting out the formatting in the callers would definitely be a win. Maybe that'd make the whole thing sufficiently palatable :)
On Wed, 2025-04-23 at 17:32 -0700, Jakub Kicinski wrote:
> On Wed, 23 Apr 2025 20:04:58 -0400 Jeff Layton wrote:
> > On Wed, 2025-04-23 at 16:53 -0700, Jakub Kicinski wrote:
> > > Names are not unique and IIUC debugfs is not namespaced.
Correct, debugfs is not namespaced.
I meant to say earlier that I'm open to suggestions on how to make the
netdev names unique. Low-level netdev stuff is not my area of
expertise. We can drop this patch if doing so is problematic.
> > > How much naming the objects in a "user readable" fashion actually
> > > matter? It'd be less churn to create some kind of "object class"
> > > with a directory level named after what's already passed to
> > > ref_tracker_dir_init() and then id the objects by the pointer value
> > > as sub-dirs of that?
> >
> > That sounds closer to what I had done originally. Andrew L. suggested
> > the flat directory that this version represents. I'm fine with whatever
> > hierarchy, but let's decide that before I respin again.
>
> Sorry about that :(
>
No worries...but we do need to decide what this directory hierarchy
should look like.
Andrew's point earlier was that this is just debugfs, so a flat
"ref_tracker" directory full of files is fine. I tend to agree with
him; NAME_MAX is 255, so we have plenty of room to make uniquely-named
files.
We could build a dir hierarchy though. Something like:
- ref_tracker
+ netdev
+ netns
...and then each object class subdir would hold uniquely-named files.
Each subsys would need to create the directory and then the files
within it, but we could add helpers for that.
Note that this is debugfs, so we can always change that later as it's
not counted as part of ABI.
Which way should we do it?
> > When I was tracking down net namespace leaks recently, it was very nice
> > to have the inode number of the leaked netns's in the filenames. I
> > would have probably had to grovel around with drgn to figure out the
> > address if that weren't embedded in the name. I think we probably ought
> > to leave it up to each subsystem how it names its files. The
> > discriminators between different types of objects can vary wildly.
> >
> > One thing that might be simpler is to make ref_tracker_dir_debugfs() a
> > varargs function and allow passing in a format string and set of
> > arguments for it. That might make things simpler for the callers.
>
> Yes, cutting out the formatting in the callers would definitely
> be a win. Maybe that'd make the whole thing sufficiently palatable :)
Ok, I'll plan to change that. That does make the callers a lot cleaner.
--
Jeff Layton <jlayton@kernel.org>
On Thu, Apr 24, 2025 at 06:56:06AM -0400, Jeff Layton wrote: > On Wed, 2025-04-23 at 17:32 -0700, Jakub Kicinski wrote: > > On Wed, 23 Apr 2025 20:04:58 -0400 Jeff Layton wrote: > > > On Wed, 2025-04-23 at 16:53 -0700, Jakub Kicinski wrote: > > > > Names are not unique and IIUC debugfs is not namespaced. > > Correct, debugfs is not namespaced. > > I meant to say earlier that I'm open to suggestions on how to make the > netdev names unique. Low-level netdev stuff is not my area of > expertise. We can drop this patch if doing so is problematic. > > > > > How much naming the objects in a "user readable" fashion actually > > > > matter? It'd be less churn to create some kind of "object class" > > > > with a directory level named after what's already passed to > > > > ref_tracker_dir_init() and then id the objects by the pointer value > > > > as sub-dirs of that? > > > > > > That sounds closer to what I had done originally. Andrew L. suggested > > > the flat directory that this version represents. I'm fine with whatever > > > hierarchy, but let's decide that before I respin again. > > > > Sorry about that :( > > > > No worries...but we do need to decide what this directory hierarchy > should look like. > > Andrew's point earlier was that this is just debugfs, so a flat > "ref_tracker" directory full of files is fine. I tend to agree with > him; NAME_MAX is 255, so we have plenty of room to make uniquely-named > files. > > We could build a dir hierarchy though. Something like: > > - ref_tracker > + netdev > + netns How do you make that generic? How due the GPU users of reftracker fit in? And whatever the next users are? A flat directory keeps it simple. Anybody capable of actually using this has to have a level of intelligence sufficient for glob(3). However, a varargs format function does make sense, since looking at the current users, many of them will need it. Andrew
On Thu, 24 Apr 2025 14:10:03 +0200 Andrew Lunn wrote:
> > > > > How much naming the objects in a "user readable" fashion actually
> > > > > matter? It'd be less churn to create some kind of "object class"
> > > > > with a directory level named after what's already passed to
> > > > > ref_tracker_dir_init() and then id the objects by the pointer value
> > > > > as sub-dirs of that?
> > > >
> > > > That sounds closer to what I had done originally. Andrew L. suggested
> > > > the flat directory that this version represents. I'm fine with whatever
> > > > hierarchy, but let's decide that before I respin again.
> > >
> > > Sorry about that :(
> > >
> >
> > No worries...but we do need to decide what this directory hierarchy
> > should look like.
> >
> > Andrew's point earlier was that this is just debugfs, so a flat
> > "ref_tracker" directory full of files is fine. I tend to agree with
> > him; NAME_MAX is 255, so we have plenty of room to make uniquely-named
> > files.
> >
> > We could build a dir hierarchy though. Something like:
> >
> > - ref_tracker
> > + netdev
> > + netns
>
> How do you make that generic? How due the GPU users of reftracker fit
> in? And whatever the next users are? A flat directory keeps it
> simple. Anybody capable of actually using this has to have a level of
> intelligence sufficient for glob(3).
>
> However, a varargs format function does make sense, since looking at
> the current users, many of them will need it.
No preference on my side about the hierarchy TBH. I just defaulted to
thinking about it in terms of a hierarchy class/id rather than class-id
but shouldn't matter.
The main point I was trying to make was about using a synthetic ID -
like the pointer value. For the netdevs this patchset waits until
the very end of the registration process to add the debugfs dir
because (I'm guessing) the name isn't assigned when we alloc
the device (and therefore when we call ref_tracker_dir_init()).
Using synthetic ID lets us register the debugfs from
ref_tracker_dir_init().
In fact using "registered name" can be misleading. In modern setups
where devices are renamed based on the system topology, after
registration.
The Ethernet interface on my laptop is called enp0s13f0u1u1,
not eth0. It is renamed by systemd right _after_ registration.
[45224.911324] r8152 2-1.1:1.0 eth0: v1.12.13
[45225.220032] r8152 2-1.1:1.0 enp0s13f0u1u1: renamed from eth0
so in (most?) modern systems the name we carefully printed
into the debugfs name will in fact not match the current device name.
What more we don't try to keep the IDs monotonically increasing.
If I plug in another Ethernet card it will also be called eth0 when
it's registered, again. You can experiment by adding dummy devices:
# ip link add type dummy
# ip link
...
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff
# ip link set dev dummy0 name other-name
[ 206.747381][ T670] other-name: renamed from dummy0
# ip link
...
2: other-name: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff
# ip link add type dummy
# ip link
...
2: other-name: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff
3: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff
The second device is called dummy0, again, because that name was
"free". So with the current code we delay the registration of debugfs
just to provide a name which is in fact misleading :S
But with all that said, I guess you still want the "meaningful" ID for
the netns, and that one is in fact stable :S
On Thu, 2025-04-24 at 15:52 -0700, Jakub Kicinski wrote: > On Thu, 24 Apr 2025 14:10:03 +0200 Andrew Lunn wrote: > > > > > > How much naming the objects in a "user readable" fashion actually > > > > > > matter? It'd be less churn to create some kind of "object class" > > > > > > with a directory level named after what's already passed to > > > > > > ref_tracker_dir_init() and then id the objects by the pointer value > > > > > > as sub-dirs of that? > > > > > > > > > > That sounds closer to what I had done originally. Andrew L. suggested > > > > > the flat directory that this version represents. I'm fine with whatever > > > > > hierarchy, but let's decide that before I respin again. > > > > > > > > Sorry about that :( > > > > > > > > > > No worries...but we do need to decide what this directory hierarchy > > > should look like. > > > > > > Andrew's point earlier was that this is just debugfs, so a flat > > > "ref_tracker" directory full of files is fine. I tend to agree with > > > him; NAME_MAX is 255, so we have plenty of room to make uniquely-named > > > files. > > > > > > We could build a dir hierarchy though. Something like: > > > > > > - ref_tracker > > > + netdev > > > + netns > > > > How do you make that generic? How due the GPU users of reftracker fit > > in? And whatever the next users are? A flat directory keeps it > > simple. Anybody capable of actually using this has to have a level of > > intelligence sufficient for glob(3). > > > > However, a varargs format function does make sense, since looking at > > the current users, many of them will need it. > > No preference on my side about the hierarchy TBH. I just defaulted to > thinking about it in terms of a hierarchy class/id rather than class-id > but shouldn't matter. > > The main point I was trying to make was about using a synthetic ID - > like the pointer value. For the netdevs this patchset waits until > the very end of the registration process to add the debugfs dir > because (I'm guessing) the name isn't assigned when we alloc > the device (and therefore when we call ref_tracker_dir_init()). > > Using synthetic ID lets us register the debugfs from > ref_tracker_dir_init(). > Yes, exactly. dev->name doesn't get populated until later, so we have to delay creating the debugfs file if we want the name to be meaningful. Ditto for the netns and its inode number. We certainly could move to "classname@%px" format. If we go that route, I'd suggest that we convert the ref_tracker_dir.name field to a const char * pointer, and have the ref_tracker_dir_init() callers pass in a pointer to a static classname string. No need to keep copies in that case. The i915 ref_trackers would need to be changed, but hopefully that's not a problem. With that change we could register the debugfs files in ref_tracker_dir_init() instead of having an extra call. The caveat there is that some of these objects get created very early (e.g. init_net), before debugfs comes online, so we'll miss creating debugfs files for those objects. Maybe that's no big deal. > In fact using "registered name" can be misleading. In modern setups > where devices are renamed based on the system topology, after > registration. > > The Ethernet interface on my laptop is called enp0s13f0u1u1, > not eth0. It is renamed by systemd right _after_ registration. > > [45224.911324] r8152 2-1.1:1.0 eth0: v1.12.13 > [45225.220032] r8152 2-1.1:1.0 enp0s13f0u1u1: renamed from eth0 > > so in (most?) modern systems the name we carefully printed > into the debugfs name will in fact not match the current device name. > What more we don't try to keep the IDs monotonically increasing. > If I plug in another Ethernet card it will also be called eth0 when > it's registered, again. You can experiment by adding dummy devices: > > # ip link add type dummy > # ip link > ... > 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff > > # ip link set dev dummy0 name other-name > [ 206.747381][ T670] other-name: renamed from dummy0 > # ip link > ... > 2: other-name: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff > > # ip link add type dummy > # ip link > ... > 2: other-name: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff > 3: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether b2:51:ee:5b:4b:83 brd ff:ff:ff:ff:ff:ff > > > The second device is called dummy0, again, because that name was > "free". So with the current code we delay the registration of debugfs > just to provide a name which is in fact misleading :S > Lovely. > But with all that said, I guess you still want the "meaningful" ID for > the netns, and that one is in fact stable :S I would prefer that, but if that's not feasible then I can live without meaningful names. We'd just have to determine some way to get the address of a particular netns or netdev, which is an extra step when debugging, but one we can do (e.g., with drgn). -- Jeff Layton <jlayton@kernel.org>
On Thu, Apr 24, 2025 at 3:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
> But with all that said, I guess you still want the "meaningful" ID for
> the netns, and that one is in fact stable :S
We could add a stable id for net devices, and use it for this purpose
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0321fd952f70887735ac789d72f72948a3879832..339d09167eaf7f58fc877fa470c94175237ce534
100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2544,6 +2544,8 @@ struct net_device {
struct hwtstamp_provider __rcu *hwprov;
+ u64 permanent_id;
+
u8 priv[] ____cacheline_aligned
__counted_by(priv_len);
} ____cacheline_aligned;
diff --git a/net/core/dev.c b/net/core/dev.c
index d1a8cad0c99c47996e8bda44bf220266a5e51102..9d2d45e0246fab99ad3e7523885224bd114fa686
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10954,6 +10954,9 @@ int register_netdevice(struct net_device *dev)
{
int ret;
struct net *net = dev_net(dev);
+ static atomic64_t permanent_id;
+
+ dev->permanent_id = atomic64_inc_return(&permanent_id);
BUILD_BUG_ON(sizeof(netdev_features_t) * BITS_PER_BYTE <
NETDEV_FEATURE_COUNT);
On Thu, 2025-04-24 at 16:07 -0700, Eric Dumazet wrote:
> On Thu, Apr 24, 2025 at 3:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> > But with all that said, I guess you still want the "meaningful" ID for
> > the netns, and that one is in fact stable :S
>
> We could add a stable id for net devices, and use it for this purpose
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0321fd952f70887735ac789d72f72948a3879832..339d09167eaf7f58fc877fa470c94175237ce534
> 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2544,6 +2544,8 @@ struct net_device {
>
> struct hwtstamp_provider __rcu *hwprov;
>
> + u64 permanent_id;
> +
> u8 priv[] ____cacheline_aligned
> __counted_by(priv_len);
> } ____cacheline_aligned;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d1a8cad0c99c47996e8bda44bf220266a5e51102..9d2d45e0246fab99ad3e7523885224bd114fa686
> 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10954,6 +10954,9 @@ int register_netdevice(struct net_device *dev)
> {
> int ret;
> struct net *net = dev_net(dev);
> + static atomic64_t permanent_id;
> +
> + dev->permanent_id = atomic64_inc_return(&permanent_id);
>
> BUILD_BUG_ON(sizeof(netdev_features_t) * BITS_PER_BYTE <
> NETDEV_FEATURE_COUNT);
That would give us uniqueness, but the name doesn't mean anything and
it wouldn't be trivial to track down which ref_tracker/netdev-* file
refers to which device.
If we're going to go that route, we're probably better off just
embedding the address in the name using a "netdev@%px" format.
--
Jeff Layton <jlayton@kernel.org>
© 2016 - 2026 Red Hat, Inc.