The stock kernel supports partially lockless handling in that iput() can
decrement any value > 1. Any ref acquire, however, requires the spinlock.
With this patchset ref acquires when the value was already at least 1
also become lockless. That is, only transitions 0->1 and 1->0 take the
lock.
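
For illustration only, here is a minimal user-space sketch of the scheme
described above. It assumes C11 atomics and a pthread mutex standing in
for ->i_count and the inode spinlock. All names (struct obj, ref_get(),
ref_put() and the helpers) are hypothetical and this is not the code from
the patchset.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode: a refcount plus its lock. */
struct obj {
	atomic_int count;
	pthread_mutex_t lock;
};

/* Lockless acquire: bump the count only if it is already >= 1. */
static bool ref_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->count);

	while (old >= 1) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Acquire a reference; only the 0->1 transition takes the lock. */
static void ref_get(struct obj *o)
{
	if (ref_get_not_zero(o))
		return;
	pthread_mutex_lock(&o->lock);
	atomic_fetch_add(&o->count, 1);
	pthread_mutex_unlock(&o->lock);
}

/* Lockless release: drop the count only if it stays >= 1 afterwards. */
static bool ref_put_not_last(struct obj *o)
{
	int old = atomic_load(&o->count);

	while (old > 1) {
		if (atomic_compare_exchange_weak(&o->count, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Release a reference; only the 1->0 transition takes the lock.
 * Returns true if this call dropped the count to zero.
 */
static bool ref_put(struct obj *o)
{
	bool last;

	if (ref_put_not_last(o))
		return false;
	pthread_mutex_lock(&o->lock);
	last = atomic_fetch_sub(&o->count, 1) == 1;
	pthread_mutex_unlock(&o->lock);
	return last;
}
```

In this sketch the common case (count already >= 1 on acquire, count > 1
on release) never touches the mutex. The real code has to handle more
inode state under the lock than this toy does.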
I verified that when nfs calls into the hash, taking the lock is
typically avoided. Similarly, btrfs likes to igrab(), which also avoids
the lock.
However, I have to fully admit I did not perform any benchmarks. While
cleaning stuff up I noticed lockless operation is almost readily
available so I went for it.
Clean-up wise, the icount_read_once() stuff lines up with inode_state_read_once().
The prefix is different but I opted to not change it due to igrab(), ihold() et al.
There is a future-proofing change in iput_final(). I am not going to
strongly insist on it, but at the very least the problem it sorts out
needs to be noted in a comment.
v5:
- reword some commentary
- add unlikely to the new icount check in iput_final()
v4:
- squash icount_read patches
- use icount_read_once in the new ihold assert, reported by syzbot
- squash lockless ref acquire patches, rewrite new comments
v3:
- tidy up ihold
- add lockless handling to the hash
Mateusz Guzik (4):
  fs: add icount_read_once() and stop open-coding ->i_count loads
  fs: relocate and tidy up ihold()
  fs: handle potential filesystems which use I_DONTCACHE and drop the
    lock in ->drop_inode
  fs: allow lockless ->i_count bumps as long as it does not transition
    0->1
 arch/powerpc/platforms/cell/spufs/file.c |   2 +-
 fs/btrfs/inode.c                         |   2 +-
 fs/ceph/mds_client.c                     |   2 +-
 fs/dcache.c                              |   4 +
 fs/ext4/ialloc.c                         |   4 +-
 fs/hpfs/inode.c                          |   2 +-
 fs/inode.c                               | 122 ++++++++++++++++++-----
 fs/nfs/inode.c                           |   4 +-
 fs/smb/client/inode.c                    |   2 +-
 fs/ubifs/super.c                         |   2 +-
 fs/xfs/xfs_inode.c                       |   2 +-
 fs/xfs/xfs_trace.h                       |   2 +-
 include/linux/fs.h                       |  13 +++
 include/trace/events/filelock.h          |   2 +-
 security/landlock/fs.c                   |   2 +-
 15 files changed, 130 insertions(+), 37 deletions(-)
--
2.48.1