[PATCH v13 0/9] ref_tracker: add ability to register a debugfs file for a ref_tracker_dir

Jeff Layton posted 9 patches 4 months ago
There is a newer version of this series
drivers/gpu/drm/display/drm_dp_tunnel.c |   2 +-
drivers/gpu/drm/i915/intel_runtime_pm.c |   4 +-
drivers/gpu/drm/i915/intel_wakeref.c    |   3 +-
include/linux/ref_tracker.h             |  58 ++++++++++-
lib/ref_tracker.c                       | 176 +++++++++++++++++++++++++++++---
net/core/dev.c                          |   2 +-
net/core/net_namespace.c                |  34 +++++-
7 files changed, 253 insertions(+), 26 deletions(-)
[PATCH v13 0/9] ref_tracker: add ability to register a debugfs file for a ref_tracker_dir
Posted by Jeff Layton 4 months ago
For those just joining in, this series adds a new top-level
"ref_tracker" debugfs directory, and has each ref_tracker_dir register a
file in there as part of its initialization. It also adds the ability to
register a symlink with a more human-usable name that points to the
file, and does some general cleanup of how the ref_tracker object names
are handled.

This reposting is mostly to address Krzysztof's comments. I've dropped
the i915 patch, and rebased the rest of the series on top.

Note that I still see debugfs: warnings in the i915 driver even when we
gate the registration of the debugfs file on the classname pointer being
NULL. Here is a CI report from v12:

    https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_148490v8/bat-arls-6/igt@i915_selftest@live@workarounds.html

I think the i915 driver is doing something it shouldn't with these
objects. They seem to be initialized more than once, which could lead
to leaked ref_tracker objects. It would be good for one of the i915
maintainers to comment on whether this is a real problem.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v13:
- drop i915 patch
- Link to v12: https://lore.kernel.org/r/20250529-reftrack-dbgfs-v12-0-11b93c0c0b6e@kernel.org

Changes in v12:
- drop redundant pr_warn() calls. Debugfs already warns when these ops fail
- Link to v11: https://lore.kernel.org/r/20250528-reftrack-dbgfs-v11-0-94ae0b165841@kernel.org

Changes in v11:
- don't call ref_tracker_dir_init() more than once for same i915 objects
- use %llx in format for net_cookie in symlink name
- Link to v10: https://lore.kernel.org/r/20250527-reftrack-dbgfs-v10-0-dc55f7705691@kernel.org

Changes in v10:
- drop the i915 symlink patch
- Link to v9: https://lore.kernel.org/r/20250509-reftrack-dbgfs-v9-0-8ab888a4524d@kernel.org

Changes in v9:
- fix typo in ref_tracker_dir_init() kerneldoc header
- Link to v8: https://lore.kernel.org/r/20250507-reftrack-dbgfs-v8-0-607717d3bb98@kernel.org

Changes in v8:
- fix up compiler warnings that the KTR warned about
- ensure builds with CONFIG_DEBUG_FS=n and CONFIG_REF_TRACKER=y work
- Link to v7: https://lore.kernel.org/r/20250505-reftrack-dbgfs-v7-0-f78c5d97bcca@kernel.org

Changes in v7:
- include net->net_cookie in netns symlink name
- add __ostream_printf to ref_tracker_dir_symlink() stub function
- remove unneeded #include of seq_file.h
- Link to v6: https://lore.kernel.org/r/20250430-reftrack-dbgfs-v6-0-867c29aff03a@kernel.org

Changes in v6:
- clean up kerneldoc comment for ref_tracker_dir_debugfs()
- add missing stub function for ref_tracker_dir_symlink()
- temporary __maybe_unused on ref_tracker_dir_seq_print() to silence compiler warning
- Link to v5: https://lore.kernel.org/r/20250428-reftrack-dbgfs-v5-0-1cbbdf2038bd@kernel.org

Changes in v5:
- add class string to each ref_tracker_dir
- auto-register debugfs file for every tracker in ref_tracker_dir_init
- add function to allow adding a symlink for each tracker
- add patches to create symlinks for netns's and i915 entries
- change output format to print class@%p instead of name@%p
- eliminate the name field in ref_tracker_dir
- fix off-by-one bug when NULL terminating name string
- Link to v4: https://lore.kernel.org/r/20250418-reftrack-dbgfs-v4-0-5ca5c7899544@kernel.org

Changes in v4:
- Drop patch to widen ref_tracker_dir_.name, use NAME_MAX+1 (256) instead since this only affects dentry name
- Link to v3: https://lore.kernel.org/r/20250417-reftrack-dbgfs-v3-0-c3159428c8fb@kernel.org

Changes in v3:
- don't overwrite dir->name in ref_tracker_dir_debugfs
- define REF_TRACKER_NAMESZ and use it when setting name
- Link to v2: https://lore.kernel.org/r/20250415-reftrack-dbgfs-v2-0-b18c4abd122f@kernel.org

Changes in v2:
- Add patch to do %pK -> %p conversion in ref_tracker.c
- Pass in output function to pr_ostream() instead of if statement
- Widen ref_tracker_dir.name to 64 bytes to accomodate unique names
- Eliminate error handling with debugfs manipulation
- Incorporate pointer value into netdev name
- Link to v1: https://lore.kernel.org/r/20250414-reftrack-dbgfs-v1-0-f03585832203@kernel.org

---
Jeff Layton (9):
      ref_tracker: don't use %pK in pr_ostream() output
      ref_tracker: add a top level debugfs directory for ref_tracker
      ref_tracker: have callers pass output function to pr_ostream()
      ref_tracker: add a static classname string to each ref_tracker_dir
      ref_tracker: allow pr_ostream() to print directly to a seq_file
      ref_tracker: automatically register a file in debugfs for a ref_tracker_dir
      ref_tracker: add a way to create a symlink to the ref_tracker_dir debugfs file
      net: add symlinks to ref_tracker_dir for netns
      ref_tracker: eliminate the ref_tracker_dir name field

 drivers/gpu/drm/display/drm_dp_tunnel.c |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c |   4 +-
 drivers/gpu/drm/i915/intel_wakeref.c    |   3 +-
 include/linux/ref_tracker.h             |  58 ++++++++++-
 lib/ref_tracker.c                       | 176 +++++++++++++++++++++++++++++---
 net/core/dev.c                          |   2 +-
 net/core/net_namespace.c                |  34 +++++-
 7 files changed, 253 insertions(+), 26 deletions(-)
---
base-commit: 90b83efa6701656e02c86e7df2cb1765ea602d07
change-id: 20250413-reftrack-dbgfs-3767b303e2fa

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>
Re: [PATCH v13 0/9] ref_tracker: add ability to register a debugfs file for a ref_tracker_dir
Posted by Jakub Kicinski 4 months ago
On Tue, 03 Jun 2025 07:27:11 -0400 Jeff Layton wrote:
> For those just joining in, this series adds a new top-level
> "ref_tracker" debugfs directory, and has each ref_tracker_dir register a
> file in there as part of its initialization. It also adds the ability to
> register a symlink with a more human-usable name that points to the
> file, and does some general cleanup of how the ref_tracker object names
> are handled.
> 
> This reposting is mostly to address Krzysztof's comments. I've dropped
> the i915 patch, and rebased the rest of the series on top.
> 
> Note that I still see debugfs: warnings in the i915 driver even when we
> gate the registration of the debugfs file on the classname pointer being
> NULL. Here is a CI report from v12:
> 
>     https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_148490v8/bat-arls-6/igt@i915_selftest@live@workarounds.html
> 
> I think the i915 driver is doing something it shouldn't with these
> objects. They seem to be initialized more than once, which could lead
> to leaked ref_tracker objects. It would be good for one of the i915
> maintainers to comment on whether this is a real problem.

I still see the fs crashes:
https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/149560/2-tcp-slow-start-slow-start-app-limited-pkt/stderr
I'll hide this series from patchwork for now. We will pull from Linus
on Thu, I'll reactivate it and let's see if the problem was already
fixed.
Re: [PATCH v13 0/9] ref_tracker: add ability to register a debugfs file for a ref_tracker_dir
Posted by Jeff Layton 4 months ago
On Tue, 2025-06-03 at 19:29 -0700, Jakub Kicinski wrote:
> On Tue, 03 Jun 2025 07:27:11 -0400 Jeff Layton wrote:
> > For those just joining in, this series adds a new top-level
> > "ref_tracker" debugfs directory, and has each ref_tracker_dir register a
> > file in there as part of its initialization. It also adds the ability to
> > register a symlink with a more human-usable name that points to the
> > file, and does some general cleanup of how the ref_tracker object names
> > are handled.
> > 
> > This reposting is mostly to address Krzysztof's comments. I've dropped
> > the i915 patch, and rebased the rest of the series on top.
> > 
> > Note that I still see debugfs: warnings in the i915 driver even when we
> > gate the registration of the debugfs file on the classname pointer being
> > NULL. Here is a CI report from v12:
> > 
> >     https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_148490v8/bat-arls-6/igt@i915_selftest@live@workarounds.html
> > 
> > I think the i915 driver is doing something it shouldn't with these
> > objects. They seem to be initialized more than once, which could lead
> > to leaked ref_tracker objects. It would be good for one of the i915
> > maintainers to comment on whether this is a real problem.
> 
> I still see the fs crashes:
> https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/149560/2-tcp-slow-start-slow-start-app-limited-pkt/stderr
> I'll hide this series from patchwork for now. We will pull from Linus
> on Thu, I'll reactivate it and let's see if the problem was already
> fixed.

Sorry, I never got any mail about those failures. I would have looked
at that sooner. It looks like the last netns reference can be put in a
rcu callback? That makes sense.

I think ref_tracker_dir_exit() has to remain safe to call from any
context. Perhaps we can defer the dentry deletion to the system_wq?

We can drop this series for now. I'll have to think about this.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>
Re: [PATCH v13 0/9] ref_tracker: add ability to register a debugfs file for a ref_tracker_dir
Posted by Andrew Morton 4 months ago
On Tue, 3 Jun 2025 19:29:49 -0700 Jakub Kicinski <kuba@kernel.org> wrote:

> > I think the i915 driver is doing something it shouldn't with these
> > objects. They seem to be initialized more than once, which could lead
> > to leaked ref_tracker objects. It would be good for one of the i915
> > maintainers to comment on whether this is a real problem.
> 
> I still see the fs crashes:
> https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/149560/2-tcp-slow-start-slow-start-app-limited-pkt/stderr
> I'll hide this series from patchwork for now. We will pull from Linus
> on Thu, I'll reactivate it and let's see if the problem was already
> fixed.

Ah.  I dropped the copy from mm.git.