From nobody Wed Apr 1 09:43:47 2026
From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik
Subject: [PATCH v5 1/4] fs: add icount_read_once() and stop open-coding ->i_count loads
Date: Tue, 31 Mar 2026 18:08:48 +0200
Message-ID: <20260331160851.3854954-2-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260331160851.3854954-1-mjguzik@gmail.com>
References: <20260331160851.3854954-1-mjguzik@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Similarly to inode_state_read_once(), it makes the caller spell out they
acknowledge instability of the returned value.

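To illustrate the distinction in isolation, here is a rough userspace
sketch (not kernel code). struct obj, obj_lock(), obj_unlock() and the
lock_held flag are hypothetical stand-ins for struct inode, ->i_lock and
lockdep; only the shape of the two accessors echoes the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; not the kernel's layout. */
struct obj {
	pthread_mutex_t lock;	/* plays the role of ->i_lock */
	bool lock_held;		/* crude stand-in for lockdep tracking */
	atomic_int count;	/* plays the role of ->i_count */
};

static void obj_lock(struct obj *o)
{
	pthread_mutex_lock(&o->lock);
	o->lock_held = true;
}

static void obj_unlock(struct obj *o)
{
	o->lock_held = false;
	pthread_mutex_unlock(&o->lock);
}

/* Unlocked snapshot: the value may be stale by the time it is used. */
static int count_read_once(struct obj *o)
{
	return atomic_load(&o->count);
}

/* Locked read: asserts the lock is held, like lockdep_assert_held(). */
static int count_read(struct obj *o)
{
	assert(o->lock_held);
	return atomic_load(&o->count);
}
```

The flag only records that somebody holds the lock, not which thread,
so it is far weaker than real lockdep. The point is merely that the
locked variant documents and checks its precondition while the _once
variant makes no promise at all.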
Signed-off-by: Mateusz Guzik
---
 arch/powerpc/platforms/cell/spufs/file.c |  2 +-
 fs/btrfs/inode.c                         |  2 +-
 fs/ceph/mds_client.c                     |  2 +-
 fs/ext4/ialloc.c                         |  4 ++--
 fs/hpfs/inode.c                          |  2 +-
 fs/inode.c                               | 12 ++++++------
 fs/nfs/inode.c                           |  4 ++--
 fs/smb/client/inode.c                    |  2 +-
 fs/ubifs/super.c                         |  2 +-
 fs/xfs/xfs_inode.c                       |  2 +-
 fs/xfs/xfs_trace.h                       |  2 +-
 include/linux/fs.h                       | 13 +++++++++++++
 include/trace/events/filelock.h          |  2 +-
 security/landlock/fs.c                   |  2 +-
 14 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 10fa9b844fcc..f6de8c1169d5 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1430,7 +1430,7 @@ static int spufs_mfc_open(struct inode *inode, struct file *file)
 	if (ctx->owner != current->mm)
 		return -EINVAL;
 
-	if (icount_read(inode) != 1)
+	if (icount_read_once(inode) != 1)
 		return -EBUSY;
 
 	mutex_lock(&ctx->mapping_lock);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8d97a8ad3858..f36c49e83c04 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4741,7 +4741,7 @@ static void btrfs_prune_dentries(struct btrfs_root *root)
 
 	inode = btrfs_find_first_inode(root, min_ino);
 	while (inode) {
-		if (icount_read(&inode->vfs_inode) > 1)
+		if (icount_read_once(&inode->vfs_inode) > 1)
 			d_prune_aliases(&inode->vfs_inode);
 
 		min_ino = btrfs_ino(inode) + 1;
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index b1746273f186..2cb3c919d40d 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2223,7 +2223,7 @@ static int trim_caps_cb(struct inode *inode, int mds, void *arg)
 			int count;
 			dput(dentry);
 			d_prune_aliases(inode);
-			count = icount_read(inode);
+			count = icount_read_once(inode);
 			if (count == 1)
 				(*remaining)--;
 			doutc(cl, "%p %llx.%llx cap %p pruned, count now %d\n",
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 3fd8f0099852..8c80d5087516 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -252,10 +252,10 @@ void ext4_free_inode(handle_t *handle, struct inode *inode)
 		       "nonexistent device\n", __func__, __LINE__);
 		return;
 	}
-	if (icount_read(inode) > 1) {
+	if (icount_read_once(inode) > 1) {
 		ext4_msg(sb, KERN_ERR, "%s:%d: inode #%llu: count=%d",
 			 __func__, __LINE__, inode->i_ino,
-			 icount_read(inode));
+			 icount_read_once(inode));
 		return;
 	}
 	if (inode->i_nlink) {
diff --git a/fs/hpfs/inode.c b/fs/hpfs/inode.c
index 0e932cc8be1b..1b4fcf760aad 100644
--- a/fs/hpfs/inode.c
+++ b/fs/hpfs/inode.c
@@ -184,7 +184,7 @@ void hpfs_write_inode(struct inode *i)
 	struct hpfs_inode_info *hpfs_inode = hpfs_i(i);
 	struct inode *parent;
 	if (i->i_ino == hpfs_sb(i->i_sb)->sb_root) return;
-	if (hpfs_inode->i_rddir_off && !icount_read(i)) {
+	if (hpfs_inode->i_rddir_off && !icount_read_once(i)) {
 		if (*hpfs_inode->i_rddir_off)
 			pr_err("write_inode: some position still there\n");
 		kfree(hpfs_inode->i_rddir_off);
diff --git a/fs/inode.c b/fs/inode.c
index 5ad169d51728..1f5a383ccf27 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -907,7 +907,7 @@ void evict_inodes(struct super_block *sb)
 again:
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
-		if (icount_read(inode))
+		if (icount_read_once(inode))
 			continue;
 
 		spin_lock(&inode->i_lock);
@@ -1926,7 +1926,7 @@ static void iput_final(struct inode *inode)
 	int drop;
 
 	WARN_ON(inode_state_read(inode) & I_NEW);
-	VFS_BUG_ON_INODE(atomic_read(&inode->i_count) != 0, inode);
+	VFS_BUG_ON_INODE(icount_read(inode) != 0, inode);
 
 	if (op->drop_inode)
 		drop = op->drop_inode(inode);
@@ -1945,7 +1945,7 @@ static void iput_final(struct inode *inode)
 	 * Re-check ->i_count in case the ->drop_inode() hooks played games.
 	 * Note we only execute this if the verdict was to drop the inode.
 	 */
-	VFS_BUG_ON_INODE(atomic_read(&inode->i_count) != 0, inode);
+	VFS_BUG_ON_INODE(icount_read(inode) != 0, inode);
 
 	if (drop) {
 		inode_state_set(inode, I_FREEING);
@@ -1989,7 +1989,7 @@ void iput(struct inode *inode)
 	 * equal to one, then two CPUs racing to further drop it can both
 	 * conclude it's fine.
 	 */
-	VFS_BUG_ON_INODE(atomic_read(&inode->i_count) < 1, inode);
+	VFS_BUG_ON_INODE(icount_read_once(inode) < 1, inode);
 
 	if (atomic_add_unless(&inode->i_count, -1, 1))
 		return;
@@ -2023,7 +2023,7 @@ EXPORT_SYMBOL(iput);
 void iput_not_last(struct inode *inode)
 {
 	VFS_BUG_ON_INODE(inode_state_read_once(inode) & (I_FREEING | I_CLEAR), inode);
-	VFS_BUG_ON_INODE(atomic_read(&inode->i_count) < 2, inode);
+	VFS_BUG_ON_INODE(icount_read_once(inode) < 2, inode);
 
 	WARN_ON(atomic_sub_return(1, &inode->i_count) == 0);
 }
@@ -3046,7 +3046,7 @@ void dump_inode(struct inode *inode, const char *reason)
 	}
 
 	state = inode_state_read_once(inode);
-	count = atomic_read(&inode->i_count);
+	count = icount_read_once(inode);
 
 	if (!sb ||
 	    get_kernel_nofault(s_type, &sb->s_type) || !s_type ||
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 98a8f0de1199..22834eddd5b1 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -608,7 +608,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
 		inode->i_sb->s_id,
 		(unsigned long long)NFS_FILEID(inode),
 		nfs_display_fhandle_hash(fh),
-		icount_read(inode));
+		icount_read_once(inode));
 
 out:
 	return inode;
@@ -2261,7 +2261,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
 	dfprintk(VFS, "NFS: %s(%s/%llu fh_crc=0x%08x ct=%d info=0x%llx)\n",
 			__func__, inode->i_sb->s_id, inode->i_ino,
 			nfs_display_fhandle_hash(NFS_FH(inode)),
-			icount_read(inode), fattr->valid);
+			icount_read_once(inode), fattr->valid);
 
 	if (!(fattr->valid & NFS_ATTR_FATTR_FILEID)) {
 		/* Only a mounted-on-fileid? Just exit */
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index 888f9e35f14b..ab35e35b16d7 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -2842,7 +2842,7 @@ int cifs_revalidate_dentry_attr(struct dentry *dentry)
 	}
 
 	cifs_dbg(FYI, "Update attributes: %s inode 0x%p count %d dentry: 0x%p d_time %ld jiffies %ld\n",
-		 full_path, inode, icount_read(inode),
+		 full_path, inode, icount_read_once(inode),
 		 dentry, cifs_get_time(dentry), jiffies);
 
 again:
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 9a77d8b64ffa..38972786817e 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -358,7 +358,7 @@ static void ubifs_evict_inode(struct inode *inode)
 		goto out;
 
 	dbg_gen("inode %llu, mode %#x", inode->i_ino, (int)inode->i_mode);
-	ubifs_assert(c, !icount_read(inode));
+	ubifs_assert(c, !icount_read_once(inode));
 
 	truncate_inode_pages_final(&inode->i_data);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index beaa26ec62da..4f659eba6ae5 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1046,7 +1046,7 @@ xfs_itruncate_extents_flags(
 	int			error = 0;
 
 	xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
-	if (icount_read(VFS_I(ip)))
+	if (icount_read_once(VFS_I(ip)))
 		xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL);
 	if (whichfork == XFS_DATA_FORK)
 		ASSERT(new_size <= XFS_ISIZE(ip));
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 5e8190fe2be9..cbdec40826b3 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1156,7 +1156,7 @@ DECLARE_EVENT_CLASS(xfs_iref_class,
 	TP_fast_assign(
 		__entry->dev = VFS_I(ip)->i_sb->s_dev;
 		__entry->ino = ip->i_ino;
-		__entry->count = icount_read(VFS_I(ip));
+		__entry->count = icount_read_once(VFS_I(ip));
 		__entry->pincount = atomic_read(&ip->i_pincount);
 		__entry->iflags = ip->i_flags;
 		__entry->caller_ip = caller_ip;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8afbe2ef2686..07363fce4406 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2225,8 +2225,21 @@ static inline void mark_inode_dirty_sync(struct inode *inode)
 	__mark_inode_dirty(inode, I_DIRTY_SYNC);
 }
 
+/*
+ * returns the refcount on the inode. it can change arbitrarily.
+ */
+static inline int icount_read_once(const struct inode *inode)
+{
+	return atomic_read(&inode->i_count);
+}
+
+/*
+ * returns the refcount on the inode. The lock guarantees no new references
+ * are added, but references can be dropped as long as the result is > 0.
+ */
 static inline int icount_read(const struct inode *inode)
 {
+	lockdep_assert_held(&inode->i_lock);
 	return atomic_read(&inode->i_count);
 }
 
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 116774886244..c8c8847bb6f6 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -190,7 +190,7 @@ TRACE_EVENT(generic_add_lease,
 		__entry->i_ino = inode->i_ino;
 		__entry->wcount = atomic_read(&inode->i_writecount);
 		__entry->rcount = atomic_read(&inode->i_readcount);
-		__entry->icount = icount_read(inode);
+		__entry->icount = icount_read_once(inode);
 		__entry->owner = fl->c.flc_owner;
 		__entry->flags = fl->c.flc_flags;
 		__entry->type = fl->c.flc_type;
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..32d560f12dbd 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1278,7 +1278,7 @@ static void hook_sb_delete(struct super_block *const sb)
 		struct landlock_object *object;
 
 		/* Only handles referenced inodes. */
-		if (!icount_read(inode))
+		if (!icount_read_once(inode))
 			continue;
 
 		/*
-- 
2.48.1

From nobody Wed Apr 1 09:43:47 2026
From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik
Subject: [PATCH v5 2/4] fs: relocate and tidy up ihold()
Date: Tue, 31 Mar 2026 18:08:49 +0200
Message-ID: <20260331160851.3854954-3-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260331160851.3854954-1-mjguzik@gmail.com>
References: <20260331160851.3854954-1-mjguzik@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The placement was illogical, move it next to igrab().

Take this opportunity to add docs and an assert on the refcount. While
its modification remains gated with a WARN_ON, the new assert will also
dump the inode state which might aid debugging.

No functional changes.

Signed-off-by: Mateusz Guzik
---
 fs/inode.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 1f5a383ccf27..e397a4b56671 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -522,15 +522,6 @@ static void init_once(void *foo)
 	inode_init_once(inode);
 }
 
-/*
- * get additional reference to inode; caller must already hold one.
- */
-void ihold(struct inode *inode)
-{
-	WARN_ON(atomic_inc_return(&inode->i_count) < 2);
-}
-EXPORT_SYMBOL(ihold);
-
 struct wait_queue_head *inode_bit_waitqueue(struct wait_bit_queue_entry *wqe,
 					    struct inode *inode, u32 bit)
 {
@@ -1578,6 +1569,17 @@ ino_t iunique(struct super_block *sb, ino_t max_reserved)
 }
 EXPORT_SYMBOL(iunique);
 
+/**
+ * ihold - get a reference on the inode, provided you already have one
+ * @inode: inode to operate on
+ */
+void ihold(struct inode *inode)
+{
+	VFS_BUG_ON_INODE(icount_read_once(inode) < 1, inode);
+	WARN_ON(atomic_inc_return(&inode->i_count) < 2);
+}
+EXPORT_SYMBOL(ihold);
+
 struct inode *igrab(struct inode *inode)
 {
 	spin_lock(&inode->i_lock);
-- 
2.48.1

From nobody Wed Apr 1 09:43:47 2026
From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik
Subject: [PATCH v5 3/4] fs: handle potential filesystems which use I_DONTCACHE and drop the lock in ->drop_inode
Date: Tue, 31 Mar 2026 18:08:50 +0200
Message-ID: <20260331160851.3854954-4-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260331160851.3854954-1-mjguzik@gmail.com>
References: <20260331160851.3854954-1-mjguzik@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

f2fs and ntfs play games where they transition the refcount 0->1 and
release the inode spinlock, allowing other threads to grab a ref of
their own. They also return 0 in that case, making this problem harmless.

However, should they start using the I_DONTCACHE machinery down the road
while retaining the above, iput_final() will get a race where it can
proceed to tear down an inode with references.

The easiest way out for the time being is to future-proof it by
predicating caching on the count.

Developing better ->drop_inode semantics and sanitizing all users is
left as an exercise for the reader.

Signed-off-by: Mateusz Guzik
---
 fs/inode.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index e397a4b56671..013470e6d144 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1935,20 +1935,29 @@ static void iput_final(struct inode *inode)
 	else
 		drop = inode_generic_drop(inode);
 
-	if (!drop &&
-	    !(inode_state_read(inode) & I_DONTCACHE) &&
-	    (sb->s_flags & SB_ACTIVE)) {
+	/*
+	 * FIXME: there are ->drop_inode hooks playing nasty games releasing the
+	 * spinlock and temporarily grabbing refs, in turn opening a possibility
+	 * someone else will sneak in and grab a ref while it happens.
+	 *
+	 * If such a hook returns 0 (== don't drop) it ends up being harmless as long
+	 * as the inode is not marked with I_DONTCACHE. Otherwise we are proceeding
+	 * with teardown despite references being present.
+	 *
+	 * Damage-control the problem by including the count in the decision. However,
+	 * assert no refs showed up if the hook decided to drop the inode.
+	 */
+	if (drop)
+		VFS_BUG_ON_INODE(icount_read(inode) != 0, inode);
+
+	if (unlikely(icount_read(inode) > 0) ||
+	    (!drop && !(inode_state_read(inode) & I_DONTCACHE) &&
+	     (sb->s_flags & SB_ACTIVE))) {
 		__inode_lru_list_add(inode, true);
 		spin_unlock(&inode->i_lock);
 		return;
 	}
 
-	/*
-	 * Re-check ->i_count in case the ->drop_inode() hooks played games.
-	 * Note we only execute this if the verdict was to drop the inode.
- */ - VFS_BUG_ON_INODE(icount_read(inode) !=3D 0, inode); - if (drop) { inode_state_set(inode, I_FREEING); } else { --=20 2.48.1 From nobody Wed Apr 1 09:43:47 2026 Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A60E421F1F for ; Tue, 31 Mar 2026 16:09:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774973352; cv=none; b=tUcr/FZ3WcmUFnkLMwgOOisT/mD3mnle9Lo1C9paFew1J8qfda0XE0sE92oouMq5zPRk61scNpwnQrahFUFGKK0ADeozT0IxPSm36vd6bNQ7BC6KO0CJG7XH5IbaIPl8/Ori5x3ob6vv7Ee1pQyncg6Qcu5lDljA/SuMHO1b8Rs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774973352; c=relaxed/simple; bh=xLe7Xr4y/3FYwrBKdU9rBcEBhztw5Png+eE/YUQOL1E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WDN2osQGm+N3NzvR/O5d2TeO9KDKsJA1hH3wY5KFaC3c7B1DS7blOZgjFkppv8/L5cXYcA+hjouQzYvxeH6ugeDNZzWdQUN038vZd9D+iQsaNswu50fVosHlHWE+cZZssqSkqCqyNfWYt8kcr8jlc3MxeoxURlHkGEOCIe9jVsg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=DOMpYfCE; arc=none smtp.client-ip=209.85.221.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DOMpYfCE" Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-439d8df7620so4171581f8f.0 for ; Tue, 31 Mar 2026 09:09:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20251104; t=1774973349; x=1775578149; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HSqZSq7PMnW1S3SVm/ldyegcUrreaAeM2ZHCXOJzrgI=; b=DOMpYfCEdP5j1HAU70ogTBiVPDb/mXo4GS157dHxTg1DjrQ5pHd45zJPp29U1rUbIt CiJznxTqGpsuraP6zBoYhvZqZSb1eeHiKgNLYC20Q3B1UPdrw0ITsgTiGxKzYsDDx9NA 4VbfaWDKId15DIRYRFcS+H/WcaQOvhJD1Nk7vkq/gef+23Wa2DwigK1VMlzvP5szhCAZ GI/QEkRYMPB8lxw+dzHZbobJ3RG66fRWRKtIA/N/FHF4gbQm/kRwvC52f8pw++yd3JOk S91Py92EiR22nHciMKDwzde3QKt8k+wDGPpKGurowFb/TkjIJEYGhfhJfEQsJ5UL49/H 1HRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774973349; x=1775578149; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=HSqZSq7PMnW1S3SVm/ldyegcUrreaAeM2ZHCXOJzrgI=; b=AVb100+aQVkS3vZck8/j4nlqSOgd5fzaSIgPS9BwyDOjsUOUhzsSB/xt0n8YTbttPw MGe+hwbQcrWOvUMuHwearprm6Z0ft8UquxFUNwpjWGFTPwD9PnlXlIKrcR/2tfzxzMJ0 jVkJ4/GRzmfxplhFATdhDylAhSiawn+HK2XCo5j9Th/CSqQKBicG1Ofz8YRdqwkOVZNH SY6mLgyavWN6ZPDmAZ4zzXUOgP/aT4o8rPYK4tsd3HTeS9+4nyLHXWvjO3zX1mUHpUzG YbTRZIf4gUV7P/m6SZNW0eh1RVdhOc0XaBQuy9bK+WFKcy7ZwSBb0LWbKjaD21ehJDjm gVfA== X-Forwarded-Encrypted: i=1; AJvYcCWhG6RKf1CqFyXMk/p/3FGXL/ftSC2aUdVGz8JjMdY14vFKHEd/xf/dZf/dHGviVCfP6tJv0Pn6ZUUDVUs=@vger.kernel.org X-Gm-Message-State: AOJu0YzIGFw3+WCG3tMnzx2FtLxYDzr7AsjRSsKv4MIarfpN4THaxffX NW6Ggk/Ar+Prrg8HSec+LLePkRGhDQvv0mdN2zlrA4GEYB5W4s/VKFfo8HsU7g== X-Gm-Gg: ATEYQzwheHuv4l3nkQcIlFmCv8tpLvk5AOoAmD3b2ybYKtv3wExAPrdbG2lqc3eSe1d S5CUEWRjByTLxJy3MkLPn4qyjF9NlmUbPjirdvKJ03ZQQvfqi6wQUf7jfP8JqwKpE7E2pMNwqLY i14E3JpRA5qt7W4yNwswWl95V/QFSmsedGRAYklb4AGADcdXx3/P2yrxFdzZ3YgFQ0tEVdKTEjZ JqJ9jhO0zmbZ91sV3mqEfTBYLbYCCkqn8DF/aopAKTbwxCYJYDkn7IOhAV62KAjPB3+Tb8fLLy5 9GA88bp7MyROiFDtbBuMQK3xaQT1ibWkq2nFTLq+AZkXEzFw4i+N6yxj9P7HIHQQiE6fzCIr6nH 
X31OM7xx0/dMtq2J4gr5d+L5z4LeiRpgXOloEciwQ0DhXuu8zFLoZY6pes08vRVxxyEnR4j3msk hagPMedLXikRlx1gFMEmQPc/cXG3K/eqF2FiHrYKCmdeKjrUp8/0r3ZGWsGgZYGjfVC/ku1RfBX Q== X-Received: by 2002:a05:6000:220f:b0:43c:f8b4:e58 with SMTP id ffacd0b85a97d-43d151107bdmr389813f8f.41.1774973348468; Tue, 31 Mar 2026 09:09:08 -0700 (PDT) Received: from f.. (cst-prg-89-171.cust.vodafone.cz. [46.135.89.171]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-43cf21e3602sm28792632f8f.4.2026.03.31.09.09.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Mar 2026 09:09:07 -0700 (PDT) From: Mateusz Guzik To: brauner@kernel.org Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik Subject: [PATCH v5 4/4] fs: allow lockless ->i_count bumps as long as it does not transition 0->1 Date: Tue, 31 Mar 2026 18:08:51 +0200 Message-ID: <20260331160851.3854954-5-mjguzik@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20260331160851.3854954-1-mjguzik@gmail.com> References: <20260331160851.3854954-1-mjguzik@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" With this change only 0->1 and 1->0 transitions need the lock. I verified all places which look at the refcount either only care about it staying 0 (and have the lock enforce it) or don't hold the inode lock to begin with (making the above change irrelevant to their correctness or lack thereof). I also confirmed nfs and btrfs like to call into these a lot and now avoid the lock in the common case, shaving off some atomics.
Signed-off-by: Mateusz Guzik --- fs/dcache.c | 4 +++ fs/inode.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++ include/linux/fs.h | 4 +-- 3 files changed, 71 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 9ceab142896f..b63450ebb85c 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2033,6 +2033,10 @@ void d_instantiate_new(struct dentry *entry, struct = inode *inode) __d_instantiate(entry, inode); spin_unlock(&entry->d_lock); WARN_ON(!(inode_state_read(inode) & I_NEW)); + /* + * Paired with igrab_try_lockless() + */ + smp_wmb(); inode_state_clear(inode, I_NEW | I_CREATING); inode_wake_up_bit(inode, __I_NEW); spin_unlock(&inode->i_lock); diff --git a/fs/inode.c b/fs/inode.c index 013470e6d144..03472be4e1a9 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1029,6 +1029,7 @@ long prune_icache_sb(struct super_block *sb, struct s= hrink_control *sc) } =20 static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked,= bool rcu_locked); +static bool igrab_try_lockless(struct inode *inode); =20 /* * Called with the inode lock held. 
@@ -1053,6 +1054,11 @@ static struct inode *find_inode(struct super_block *= sb, continue; if (!test(inode, data)) continue; + if (igrab_try_lockless(inode)) { + rcu_read_unlock(); + *isnew =3D false; + return inode; + } spin_lock(&inode->i_lock); if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) { __wait_on_freeing_inode(inode, hash_locked, true); @@ -1095,6 +1101,11 @@ static struct inode *find_inode_fast(struct super_bl= ock *sb, continue; if (inode->i_sb !=3D sb) continue; + if (igrab_try_lockless(inode)) { + rcu_read_unlock(); + *isnew =3D false; + return inode; + } spin_lock(&inode->i_lock); if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) { __wait_on_freeing_inode(inode, hash_locked, true); @@ -1212,6 +1223,10 @@ void unlock_new_inode(struct inode *inode) lockdep_annotate_inode_mutex_key(inode); spin_lock(&inode->i_lock); WARN_ON(!(inode_state_read(inode) & I_NEW)); + /* + * Paired with igrab_try_lockless() + */ + smp_wmb(); inode_state_clear(inode, I_NEW | I_CREATING); inode_wake_up_bit(inode, __I_NEW); spin_unlock(&inode->i_lock); @@ -1223,6 +1238,10 @@ void discard_new_inode(struct inode *inode) lockdep_annotate_inode_mutex_key(inode); spin_lock(&inode->i_lock); WARN_ON(!(inode_state_read(inode) & I_NEW)); + /* + * Paired with igrab_try_lockless() + */ + smp_wmb(); inode_state_clear(inode, I_NEW); inode_wake_up_bit(inode, __I_NEW); spin_unlock(&inode->i_lock); @@ -1582,6 +1601,14 @@ EXPORT_SYMBOL(ihold); =20 struct inode *igrab(struct inode *inode) { + /* + * Read commentary above igrab_try_lockless() for an explanation why this= works. 
+ */ + if (atomic_add_unless(&inode->i_count, 1, 0)) { + VFS_BUG_ON_INODE(inode_state_read_once(inode) & (I_FREEING | I_WILL_FREE= ), inode); + return inode; + } + spin_lock(&inode->i_lock); if (!(inode_state_read(inode) & (I_FREEING | I_WILL_FREE))) { __iget(inode); @@ -1599,6 +1626,44 @@ struct inode *igrab(struct inode *inode) } EXPORT_SYMBOL(igrab); =20 +/* + * igrab_try_lockless - special inode refcount acquire primitive for the i= node hash + * (don't use elsewhere!) + * + * It provides lockless refcount acquire in the common case of no problema= tic + * flags being set and the count being > 0. + * + * There are 4 state flags to worry about and the routine makes sure to no= t bump the + * ref if any of them is present. + * + * I_NEW and I_CREATING can only legally get set *before* the inode become= s visible + * during lookup. Thus if the flags are not spotted, they are guaranteed t= o not be + * a factor. However, we need an acquire fence before returning the inode = just + * in case we raced against clearing the state to make sure our consumer p= icks up + * any other changes made prior. atomic_add_unless provides a full fence, = which + * takes care of it. + * + * I_FREEING and I_WILL_FREE can only legally get set if ->i_count =3D=3D = 0 and it is + * illegal to bump the ref if either is present. Consequently if atomic_ad= d_unless + * managed to replace a non-0 value with a bigger one, we have a guarantee= neither + * of these flags is set. Note this means explicit checking of these fla= gs below + * is not necessary; it is only done because it does not cost anything on = top of the + * load which already needs to be done to handle the other flags.
+ */ +static bool igrab_try_lockless(struct inode *inode) +{ + if (inode_state_read_once(inode) & (I_NEW | I_CREATING | I_FREEING | I_WI= LL_FREE)) + return false; + /* + * Paired with routines clearing I_NEW + */ + if (atomic_add_unless(&inode->i_count, 1, 0)) { + VFS_BUG_ON_INODE(inode_state_read_once(inode) & (I_FREEING | I_WILL_FREE= ), inode); + return true; + } + return false; +} + /** * ilookup5_nowait - search for an inode in the inode cache * @sb: super block of file system to search diff --git a/include/linux/fs.h b/include/linux/fs.h index 07363fce4406..119e0a3d2f42 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2234,8 +2234,8 @@ static inline int icount_read_once(const struct inode= *inode) } =20 /* - * returns the refcount on the inode. The lock guarantees no new references - * are added, but references can be dropped as long as the result is > 0. + * returns the refcount on the inode. The lock guarantees no 0->1 or 1->0 = transitions + * of the count are going to take place, otherwise it changes arbitrarily. */ static inline int icount_read(const struct inode *inode) { --=20 2.48.1
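For readers following along outside the kernel tree, the fast path described in the igrab_try_lockless() commentary can be approximated in userspace with C11 atomics. This is only a sketch under stated assumptions, not the kernel implementation: struct obj, OBJ_NEW, OBJ_FREEING, add_unless_zero() and obj_try_get_lockless() are hypothetical names, the flag values are made up, and the default seq_cst compare-exchange merely stands in for the full ordering the patch expects from a successful atomic_add_unless().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for inode state bits; not the kernel's values. */
enum { OBJ_NEW = 1u << 0, OBJ_FREEING = 1u << 1 };

struct obj {
	atomic_int count;	/* plays the role of ->i_count */
	atomic_uint state;	/* plays the role of the inode state flags */
};

/* Add 1 to *v unless it is currently 0; true if the increment happened. */
static bool add_unless_zero(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	while (old != 0) {
		/* A failed CAS reloads 'old', so the zero check is repeated. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Lockless fast path: refuse if a problematic flag is visible or count is 0. */
static bool obj_try_get_lockless(struct obj *o)
{
	if (atomic_load(&o->state) & (OBJ_NEW | OBJ_FREEING))
		return false;
	if (!add_unless_zero(&o->count))
		return false;	/* a 0->1 transition must take the locked slow path */
	/* FREEING may only be set while the count is 0, so it cannot be set here. */
	assert(!(atomic_load(&o->state) & OBJ_FREEING));
	return true;
}
```

A caller would fall back to its locked path whenever obj_try_get_lockless() returns false, which mirrors how find_inode() and igrab() proceed to spin_lock(&inode->i_lock) in the patch above.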