New APIs for name lookup and lock for directory operations

[PATCH 0/7 RFC] New APIs for name lookup and lock for directory operations

Posted by NeilBrown 6 months, 3 weeks ago

Hi,

 these patches (against vfs.all) primarily introduce new APIs for
 preparing dentries for create, remove, rename.  The goal is to
 centralise knowledge of how we do locking (currently by locking the
 directory) so that we can eventually change the mechanism (e.g.  to
 locking just the dentry).

 Naming is difficult and I've changed my mind several times. :-)

 The basic approach is to return a dentry which can be passed to
 vfs_create(), vfs_unlink() etc, and subsequently to release that
 dentry.  The closest analogue to this in the VFS is kern_path_create()
 which is paired with done_path_create(), though there is also
 kern_path_locked() which is paired with explicit inode_unlock() and
 dput().  So my current approach uses "done_" for finishing up.

 I have:
   dentry_lookup() dentry_lookup_noperm() dentry_lookup_hashed()
   dentry_lookup_killable()
 paired with
   done_dentry_lookup()

 and also
   rename_lookup() rename_lookup_noperm() rename_lookup_hashed()
 paired with
   done_rename_lookup()
 (these take a "struct renamedata *" to which some qstrs are added.

 There is also "dentry_lock_in()" which is used instead of
 dentry_lookup() when you already have the dentry and want to lock it.
 So you "lock" it "in" a given parent.  I'm not very proud of this name,
 but I don't want to use "dentry_lock" as I want to save that for
 low-level locking primitives.

 There is also done_dentry_lookup_return() which doesn't dput() the
 dentry but returns it instread.  In about 1/6 of places where I need
 done_dentry_lookup() the code makes use of the dentry afterwards.  Only
 in half the places where done_dentry_lookup_return() is used is the
 returned value immediately returned by the calling function.  I could
 do a dget() before done_dentry_lookup(), but that looks awkward and I
 think having the _return version is justified.  I'm happy to hear other
 opinions.

 In order for this dentry-focussed API to work we need to have the
 dentry to unlock.  vfs_rmdir() currently consumes the dentry on
 failure, so we don't have it unless we clumsily keep a copy.  So an
 early patch changes vfs_rmdir() to both consume the dentry and drop the
 lock on failure.

 After these new APIs are refined, agreed, and applied I will have a
 collection of patches to roll them out throughout the kernel.  Then we
 can start/continue discussing a new approach to locking which allows
 directory operations to proceed in parallel.

 If you want a sneak peek at some of this future work - for context
 mostly - my current devel code is at https://github.com/neilbrown/linux.git
 in a branch "pdirops".  Be warned that a lot of the later code is under
 development, is known to be wrong, and doesn't even compile.  Not today
 anyway.  The rolling out of the new APIs is fairly mature though.

 Please review and suggest better names, or tell me that my choices are adequate.
 And find the bugs in the code too :-)

 I haven't cc:ed the maintains of the non-VFS code that the patches
 touch.  I can do that once the approach and names have been approved.

Thanks,
NeilBrown


 [PATCH 1/7] VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
 [PATCH 2/7] VFS: introduce done_dentry_lookup()
 [PATCH 3/7] VFS: Change vfs_mkdir() to unlock on failure.
 [PATCH 4/7] VFS: introduce dentry_lookup() and friends
 [PATCH 5/7] VFS: add dentry_lookup_killable()
 [PATCH 6/7] VFS: add rename_lookup()
 [PATCH 7/7] VFS: introduce dentry_lock_in()

Re: [PATCH 0/7 RFC] New APIs for name lookup and lock for directory operations

Posted by Amir Goldstein 6 months, 3 weeks ago

On Mon, Jul 21, 2025 at 10:46 AM NeilBrown <neil@brown.name> wrote:
>
> Hi,
>
>  these patches (against vfs.all) primarily introduce new APIs for
>  preparing dentries for create, remove, rename.  The goal is to
>  centralise knowledge of how we do locking (currently by locking the
>  directory) so that we can eventually change the mechanism (e.g.  to
>  locking just the dentry).
>
>  Naming is difficult and I've changed my mind several times. :-)

Indeed it is.
I generally like the done_ approach that you took.
Few minor naming comments follow.

>
>  The basic approach is to return a dentry which can be passed to
>  vfs_create(), vfs_unlink() etc, and subsequently to release that
>  dentry.  The closest analogue to this in the VFS is kern_path_create()
>  which is paired with done_path_create(), though there is also
>  kern_path_locked() which is paired with explicit inode_unlock() and
>  dput().  So my current approach uses "done_" for finishing up.
>
>  I have:
>    dentry_lookup() dentry_lookup_noperm() dentry_lookup_hashed()

As I wrote on the patch that introduces them I find dentry_lookup_hashed()
confusing because the dentry is not hashed (only the hash is calculated).

Looking at another precedent of _noperm() vfs API we have:

vfs_setxattr()
  __vfs_setxattr_locked()
    __vfs_setxattr_noperm()
      __vfs_setxattr()

Do I'd say for lack of better naming __dentry_lookup() could makes sense
for the bare lock&dget and it could also be introduced earlier along with
introducing done_dentry_lookup()

>    dentry_lookup_killable()
>  paired with
>    done_dentry_lookup()
>
>  and also
>    rename_lookup() rename_lookup_noperm() rename_lookup_hashed()
>  paired with
>    done_rename_lookup()
>  (these take a "struct renamedata *" to which some qstrs are added.
>
>  There is also "dentry_lock_in()" which is used instead of
>  dentry_lookup() when you already have the dentry and want to lock it.
>  So you "lock" it "in" a given parent.  I'm not very proud of this name,
>  but I don't want to use "dentry_lock" as I want to save that for
>  low-level locking primitives.

Very strange name :)

What's wrong with dentry_lock_parent()?

Although I think that using the verb _lock_ for locking and dget is
actively confusing, so something along the lines of
resume_dentry_lookup()/dentry_lookup_reacquire() might serve the
readers of the code better.

>
>  There is also done_dentry_lookup_return() which doesn't dput() the
>  dentry but returns it instread.  In about 1/6 of places where I need
>  done_dentry_lookup() the code makes use of the dentry afterwards.  Only
>  in half the places where done_dentry_lookup_return() is used is the
>  returned value immediately returned by the calling function.  I could
>  do a dget() before done_dentry_lookup(), but that looks awkward and I
>  think having the _return version is justified.  I'm happy to hear other
>  opinions.

The name is not very descriptive IMO, but I do not have a better suggestion.
Unless you can describe it for the purpose that it is used for, e.g.
yeild_dentry_lookup() that can be followed with resume_dentry_lookup(),
but I do not know if those are your intentions for the return API.

Thanks,
Amir.

>
>  In order for this dentry-focussed API to work we need to have the
>  dentry to unlock.  vfs_rmdir() currently consumes the dentry on
>  failure, so we don't have it unless we clumsily keep a copy.  So an
>  early patch changes vfs_rmdir() to both consume the dentry and drop the
>  lock on failure.
>
>  After these new APIs are refined, agreed, and applied I will have a
>  collection of patches to roll them out throughout the kernel.  Then we
>  can start/continue discussing a new approach to locking which allows
>  directory operations to proceed in parallel.
>
>  If you want a sneak peek at some of this future work - for context
>  mostly - my current devel code is at https://github.com/neilbrown/linux.git
>  in a branch "pdirops".  Be warned that a lot of the later code is under
>  development, is known to be wrong, and doesn't even compile.  Not today
>  anyway.  The rolling out of the new APIs is fairly mature though.
>
>  Please review and suggest better names, or tell me that my choices are adequate.
>  And find the bugs in the code too :-)
>
>  I haven't cc:ed the maintains of the non-VFS code that the patches
>  touch.  I can do that once the approach and names have been approved.
>
> Thanks,
> NeilBrown
>
>
>  [PATCH 1/7] VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
>  [PATCH 2/7] VFS: introduce done_dentry_lookup()
>  [PATCH 3/7] VFS: Change vfs_mkdir() to unlock on failure.
>  [PATCH 4/7] VFS: introduce dentry_lookup() and friends
>  [PATCH 5/7] VFS: add dentry_lookup_killable()
>  [PATCH 6/7] VFS: add rename_lookup()
>  [PATCH 7/7] VFS: introduce dentry_lock_in()
>