[PATCH 0/3] enumarated refcounts, for debugging refcount issues

Kent Overstreet posted 3 patches 9 months, 3 weeks ago
fs/bcachefs/Makefile                |   1 +
fs/bcachefs/alloc_background.c      |  41 ++++----
fs/bcachefs/backpointers.c          |   6 +-
fs/bcachefs/bcachefs.h              | 105 ++++++++++----------
fs/bcachefs/btree_gc.c              |   7 +-
fs/bcachefs/btree_io.c              |  29 +++---
fs/bcachefs/btree_node_scan.c       |  10 +-
fs/bcachefs/btree_trans_commit.c    |   5 +-
fs/bcachefs/btree_update_interior.c |   7 +-
fs/bcachefs/btree_write_buffer.c    |  11 ++-
fs/bcachefs/buckets.c               |   4 +-
fs/bcachefs/debug.c                 |  12 ++-
fs/bcachefs/ec.c                    |  32 ++++---
fs/bcachefs/enumerated_ref.c        | 144 ++++++++++++++++++++++++++++
fs/bcachefs/enumerated_ref.h        |  66 +++++++++++++
fs/bcachefs/enumerated_ref_types.h  |  19 ++++
fs/bcachefs/fs-io-direct.c          |   7 +-
fs/bcachefs/fs-io.c                 |  15 +--
fs/bcachefs/io_read.c               |  17 ++--
fs/bcachefs/io_write.c              |  20 ++--
fs/bcachefs/journal.c               |  10 +-
fs/bcachefs/journal_io.c            |  15 +--
fs/bcachefs/journal_reclaim.c       |   2 +-
fs/bcachefs/reflink.c               |   5 +-
fs/bcachefs/sb-members.h            |  32 ++++---
fs/bcachefs/snapshot.c              |   7 +-
fs/bcachefs/subvolume.c             |   7 +-
fs/bcachefs/super-io.c              |  18 ++--
fs/bcachefs/super.c                 | 125 +++++++++++-------------
fs/bcachefs/super.h                 |   3 +
fs/bcachefs/sysfs.c                 |  46 ++++-----
31 files changed, 540 insertions(+), 288 deletions(-)
create mode 100644 fs/bcachefs/enumerated_ref.c
create mode 100644 fs/bcachefs/enumerated_ref.h
create mode 100644 fs/bcachefs/enumerated_ref_types.h
[PATCH 0/3] enumarated refcounts, for debugging refcount issues
Posted by Kent Overstreet 9 months, 3 weeks ago
Not sure we have a list for library code, but this might be of interest
to anyone who's had to debug refcount issues on refs with lots of users
(filesystem people), and I know the hardening folks deal with refcounts
a lot.

In release mode, this is just a wrapper around percpu refcounts.

In debug mode, this provides separate sub-refcounts for each enumerated
user, and provides facilities for printing them.

Meaning, if you're debugging a refcount issue, you no longer have to go
searching through the entire codebase - it'll tell you the exact
codepath.

bcachefs patches are provided as example usage, for other subsystems.

Kent Overstreet (3):
  bcachefs: enumerated_ref.c
  bcachefs: bch_fs.writes -> enumerated_refs
  bcachefs: bch_dev.io_ref -> enumerated_ref

 fs/bcachefs/Makefile                |   1 +
 fs/bcachefs/alloc_background.c      |  41 ++++----
 fs/bcachefs/backpointers.c          |   6 +-
 fs/bcachefs/bcachefs.h              | 105 ++++++++++----------
 fs/bcachefs/btree_gc.c              |   7 +-
 fs/bcachefs/btree_io.c              |  29 +++---
 fs/bcachefs/btree_node_scan.c       |  10 +-
 fs/bcachefs/btree_trans_commit.c    |   5 +-
 fs/bcachefs/btree_update_interior.c |   7 +-
 fs/bcachefs/btree_write_buffer.c    |  11 ++-
 fs/bcachefs/buckets.c               |   4 +-
 fs/bcachefs/debug.c                 |  12 ++-
 fs/bcachefs/ec.c                    |  32 ++++---
 fs/bcachefs/enumerated_ref.c        | 144 ++++++++++++++++++++++++++++
 fs/bcachefs/enumerated_ref.h        |  66 +++++++++++++
 fs/bcachefs/enumerated_ref_types.h  |  19 ++++
 fs/bcachefs/fs-io-direct.c          |   7 +-
 fs/bcachefs/fs-io.c                 |  15 +--
 fs/bcachefs/io_read.c               |  17 ++--
 fs/bcachefs/io_write.c              |  20 ++--
 fs/bcachefs/journal.c               |  10 +-
 fs/bcachefs/journal_io.c            |  15 +--
 fs/bcachefs/journal_reclaim.c       |   2 +-
 fs/bcachefs/reflink.c               |   5 +-
 fs/bcachefs/sb-members.h            |  32 ++++---
 fs/bcachefs/snapshot.c              |   7 +-
 fs/bcachefs/subvolume.c             |   7 +-
 fs/bcachefs/super-io.c              |  18 ++--
 fs/bcachefs/super.c                 | 125 +++++++++++-------------
 fs/bcachefs/super.h                 |   3 +
 fs/bcachefs/sysfs.c                 |  46 ++++-----
 31 files changed, 540 insertions(+), 288 deletions(-)
 create mode 100644 fs/bcachefs/enumerated_ref.c
 create mode 100644 fs/bcachefs/enumerated_ref.h
 create mode 100644 fs/bcachefs/enumerated_ref_types.h

-- 
2.49.0
Re: [PATCH 0/3] enumarated refcounts, for debugging refcount issues
Posted by Kees Cook 9 months, 3 weeks ago
On Sun, Apr 20, 2025 at 11:59:13AM -0400, Kent Overstreet wrote:
> Not sure we have a list for library code, but this might be of interest
> to anyone who's had to debug refcount issues on refs with lots of users
> (filesystem people), and I know the hardening folks deal with refcounts
> a lot.

Why not use refcount_t instead of atomic_t?

-- 
Kees Cook
Re: [PATCH 0/3] enumarated refcounts, for debugging refcount issues
Posted by Kent Overstreet 9 months, 3 weeks ago
On Sun, Apr 20, 2025 at 06:08:41PM -0700, Kees Cook wrote:
> On Sun, Apr 20, 2025 at 11:59:13AM -0400, Kent Overstreet wrote:
> > Not sure we have a list for library code, but this might be of interest
> > to anyone who's had to debug refcount issues on refs with lots of users
> > (filesystem people), and I know the hardening folks deal with refcounts
> > a lot.
> 
> Why not use refcount_t instead of atomic_t?

It's rather pointless here since percpu refcounts don't (and can't)
support saturation, and atomic_long_t should always suffice - you'd have
to be doing something particularly bizarre for it not to, since
refcounts generally count things in memory.

Out of curiousity, has overflow of an atomic_long_t refcount ever been
observed?
Re: [PATCH 0/3] enumarated refcounts, for debugging refcount issues
Posted by Kees Cook 9 months, 3 weeks ago
On Sun, Apr 20, 2025 at 09:27:26PM -0400, Kent Overstreet wrote:
> On Sun, Apr 20, 2025 at 06:08:41PM -0700, Kees Cook wrote:
> > On Sun, Apr 20, 2025 at 11:59:13AM -0400, Kent Overstreet wrote:
> > > Not sure we have a list for library code, but this might be of interest
> > > to anyone who's had to debug refcount issues on refs with lots of users
> > > (filesystem people), and I know the hardening folks deal with refcounts
> > > a lot.
> > 
> > Why not use refcount_t instead of atomic_t?
> 
> It's rather pointless here since percpu refcounts don't (and can't)
> support saturation, and atomic_long_t should always suffice - you'd have
> to be doing something particularly bizarre for it not to, since
> refcounts generally count things in memory.

Ah yes, my eyes skipped over the "long" part when I was reading the
patches. There's currently no sane reason to use refcount_t when
already using atomic_long_t. Sorry for the noise!

> Out of curiousity, has overflow of an atomic_long_t refcount ever been
> observed?

Not to my knowledge. :)

-- 
Kees Cook
Re: [PATCH 0/3] enumarated refcounts, for debugging refcount issues
Posted by Kane York 9 months, 2 weeks ago
On Sun, 20 Apr 2025 20:09:50 -0700, Kees Cook wrote:
> On Sun, Apr 20, 2025 at 09:27:26PM -0400, Kent Overstreet wrote:
> > On Sun, Apr 20, 2025 at 06:08:41PM -0700, Kees Cook wrote:
> > > On Sun, Apr 20, 2025 at 11:59:13AM -0400, Kent Overstreet wrote:
> > > > Not sure we have a list for library code, but this might be of interest
> > > > to anyone who's had to debug refcount issues on refs with lots of users
> > > > (filesystem people), and I know the hardening folks deal with refcounts
> > > > a lot.
> > >
> > > Why not use refcount_t instead of atomic_t?
> > Out of curiousity, has overflow of an atomic_long_t refcount ever been
> > observed?
>
> Not to my knowledge. :)

Equivalent systems have observed it, but only in the presence of compiler
optimizations that deduce they could increment the refcount multiple times.

  NEVER_INLINE void naughty_ref_increment(ref* ref) {
    long i;
    for (i = 0; i < LONG_MAX/2; i++) {
      ref_get(ref);
    }
  }

Running the above code 3 times will saturate the refcount, if it ever
terminates in our lifetimes (due to being optimized into an
atomic_fetch_add(LONG_MAX/2)).

So: don't write the above code!