A few callers operate on a dentry which they already have - unlike the
normal case where a lookup proceeds an operation.
For these callers dentry_lookup_continue() is provided where other
callers would use dentry_lookup(). The call will fail if, after the
lock was gained, the child is no longer a child of the given parent.
There are a couple of callers that want to lock a dentry in whatever
its current parent is. For these a NULL parent can be passed, in which
case ->d_parent is used. In this case the call cannot fail.
The idea behind the name is that the actual lookup occurred some time
ago, and now we are continuing with an operation on the dentry.
When the operation completes done_dentry_lookup() must be called. An
extra reference is taken when the dentry_lookup_continue() call succeeds
and will be dropped by done_dentry_lookup().
This will be used in smb/server, ecryptfs, and overlayfs, each of which
have their own lock_parent() or parent_lock() or similar; and a few
other places which lock the parent but don't check if the parent is
still correct (often because rename isn't supported so parent cannot be
incorrect).
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/namei.c | 39 +++++++++++++++++++++++++++++++++++++++
include/linux/namei.h | 2 ++
2 files changed, 41 insertions(+)
diff --git a/fs/namei.c b/fs/namei.c
index 7af9b464886a..df21b6fa5a0e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1874,6 +1874,45 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap,
}
EXPORT_SYMBOL(dentry_lookup_killable);
+/**
+ * dentry_lookup_continue: lock a dentry if it is still in the given parent, prior to dir ops
+ * @child: the dentry to lock
+ * @parent: the dentry of the assumed parent
+ *
+ * The child is locked - currently by taking i_rwsem on the parent - to
+ * prepare for create/remove operations. If the given parent is not
+ * %NULL and is no longer the parent of the dentry after the lock is
+ * gained, the lock is released and the call fails (returns
+ * ERR_PTR(-EINVAL).
+ *
+ * On success a reference to the child is taken and returned. The lock
+ * and reference must both be dropped by done_dentry_lookup() after the
+ * operation completes.
+ */
+struct dentry *dentry_lookup_continue(struct dentry *child,
+ struct dentry *parent)
+{
+ struct dentry *p = parent;
+
+again:
+ if (!parent)
+ p = dget_parent(child);
+ inode_lock_nested(d_inode(p), I_MUTEX_PARENT);
+ if (child->d_parent != p) {
+ inode_unlock(d_inode(p));
+ if (!parent) {
+ dput(p);
+ goto again;
+ }
+ return ERR_PTR(-EINVAL);
+ }
+ if (!parent)
+ dput(p);
+ /* get the child to balance with done_dentry_lookup() which puts it. */
+ return dget(child);
+}
+EXPORT_SYMBOL(dentry_lookup_continue);
+
/**
* done_dentry_lookup - finish a lookup used for create/delete
* @dentry: the target dentry
diff --git a/include/linux/namei.h b/include/linux/namei.h
index facb5852afa9..67eef91603cc 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -88,6 +88,8 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap,
unsigned int lookup_flags);
struct dentry *dentry_lookup_noperm(struct qstr *name, struct dentry *base,
unsigned int lookup_flags);
+struct dentry *dentry_lookup_continue(struct dentry *child,
+ struct dentry *parent);
void done_dentry_lookup(struct dentry *dentry);
/* no_free_ptr() must not be used here - use dget() */
DEFINE_FREE(dentry_lookup, struct dentry *, if (_T) done_dentry_lookup(_T))
--
2.50.0.107.gf914562f5916.dirty
On Wed, Aug 13, 2025 at 1:53 AM NeilBrown <neil@brown.name> wrote: > > A few callers operate on a dentry which they already have - unlike the > normal case where a lookup proceeds an operation. > > For these callers dentry_lookup_continue() is provided where other > callers would use dentry_lookup(). The call will fail if, after the > lock was gained, the child is no longer a child of the given parent. > > There are a couple of callers that want to lock a dentry in whatever > its current parent is. For these a NULL parent can be passed, in which > case ->d_parent is used. In this case the call cannot fail. > > The idea behind the name is that the actual lookup occurred some time > ago, and now we are continuing with an operation on the dentry. > > When the operation completes done_dentry_lookup() must be called. An > extra reference is taken when the dentry_lookup_continue() call succeeds > and will be dropped by done_dentry_lookup(). > > This will be used in smb/server, ecryptfs, and overlayfs, each of which > have their own lock_parent() or parent_lock() or similar; and a few > other places which lock the parent but don't check if the parent is > still correct (often because rename isn't supported so parent cannot be > incorrect). > > Signed-off-by: NeilBrown <neil@brown.name> > --- > fs/namei.c | 39 +++++++++++++++++++++++++++++++++++++++ > include/linux/namei.h | 2 ++ > 2 files changed, 41 insertions(+) > > diff --git a/fs/namei.c b/fs/namei.c > index 7af9b464886a..df21b6fa5a0e 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -1874,6 +1874,45 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap, > } > EXPORT_SYMBOL(dentry_lookup_killable); > > +/** > + * dentry_lookup_continue: lock a dentry if it is still in the given parent, prior to dir ops > + * @child: the dentry to lock > + * @parent: the dentry of the assumed parent > + * > + * The child is locked - currently by taking i_rwsem on the parent - to > + * prepare for create/remove operations. If the given parent is not > + * %NULL and is no longer the parent of the dentry after the lock is > + * gained, the lock is released and the call fails (returns > + * ERR_PTR(-EINVAL). > + * > + * On success a reference to the child is taken and returned. The lock > + * and reference must both be dropped by done_dentry_lookup() after the > + * operation completes. > + */ > +struct dentry *dentry_lookup_continue(struct dentry *child, > + struct dentry *parent) > +{ > + struct dentry *p = parent; > + > +again: > + if (!parent) > + p = dget_parent(child); > + inode_lock_nested(d_inode(p), I_MUTEX_PARENT); > + if (child->d_parent != p) { || d_unhashed(child)) ;) and what about silly renames? are those also d_unhashed()? Thanks, Amir.
On Tue, Aug 12, 2025 at 12:25:07PM +1000, NeilBrown wrote: > A few callers operate on a dentry which they already have - unlike the > normal case where a lookup proceeds an operation. > > For these callers dentry_lookup_continue() is provided where other > callers would use dentry_lookup(). The call will fail if, after the > lock was gained, the child is no longer a child of the given parent. > > There are a couple of callers that want to lock a dentry in whatever > its current parent is. For these a NULL parent can be passed, in which > case ->d_parent is used. In this case the call cannot fail. > > The idea behind the name is that the actual lookup occurred some time > ago, and now we are continuing with an operation on the dentry. > > When the operation completes done_dentry_lookup() must be called. An > extra reference is taken when the dentry_lookup_continue() call succeeds > and will be dropped by done_dentry_lookup(). > > This will be used in smb/server, ecryptfs, and overlayfs, each of which > have their own lock_parent() or parent_lock() or similar; and a few > other places which lock the parent but don't check if the parent is > still correct (often because rename isn't supported so parent cannot be > incorrect). I would really like the see the conversion of these callers. You are asking for a buy-in for a primitive with specific semantics; that's hard to review without seeing how it will be used.
On Wed, 13 Aug 2025, Al Viro wrote: > On Tue, Aug 12, 2025 at 12:25:07PM +1000, NeilBrown wrote: > > A few callers operate on a dentry which they already have - unlike the > > normal case where a lookup proceeds an operation. > > > > For these callers dentry_lookup_continue() is provided where other > > callers would use dentry_lookup(). The call will fail if, after the > > lock was gained, the child is no longer a child of the given parent. > > > > There are a couple of callers that want to lock a dentry in whatever > > its current parent is. For these a NULL parent can be passed, in which > > case ->d_parent is used. In this case the call cannot fail. > > > > The idea behind the name is that the actual lookup occurred some time > > ago, and now we are continuing with an operation on the dentry. > > > > When the operation completes done_dentry_lookup() must be called. An > > extra reference is taken when the dentry_lookup_continue() call succeeds > > and will be dropped by done_dentry_lookup(). > > > > This will be used in smb/server, ecryptfs, and overlayfs, each of which > > have their own lock_parent() or parent_lock() or similar; and a few > > other places which lock the parent but don't check if the parent is > > still correct (often because rename isn't supported so parent cannot be > > incorrect). > > I would really like the see the conversion of these callers. You are > asking for a buy-in for a primitive with specific semantics; that's > hard to review without seeing how it will be used. > All, or just some? I use dentry_lookup_continue() in: cachefiles: 4 times ecryptfs: once overlayfs: twice smb/server: once apparmor: once Maybe I could include all in the one patch... NeilBrown
© 2016 - 2025 Red Hat, Inc.