[PATCH 04/11] VFS: introduce dentry_lookup_continue()

NeilBrown posted 11 patches 1 month, 3 weeks ago
There is a newer version of this series
[PATCH 04/11] VFS: introduce dentry_lookup_continue()
Posted by NeilBrown 1 month, 3 weeks ago
A few callers operate on a dentry which they already have - unlike the
normal case where a lookup proceeds an operation.

For these callers dentry_lookup_continue() is provided where other
callers would use dentry_lookup().  The call will fail if, after the
lock was gained, the child is no longer a child of the given parent.

There are a couple of callers that want to lock a dentry in whatever
its current parent is.  For these a NULL parent can be passed, in which
case ->d_parent is used.  In this case the call cannot fail.

The idea behind the name is that the actual lookup occurred some time
ago, and now we are continuing with an operation on the dentry.

When the operation completes done_dentry_lookup() must be called.  An
extra reference is taken when the dentry_lookup_continue() call succeeds
and will be dropped by done_dentry_lookup().

This will be used in smb/server, ecryptfs, and overlayfs, each of which
have their own lock_parent() or parent_lock() or similar; and a few
other places which lock the parent but don't check if the parent is
still correct (often because rename isn't supported so parent cannot be
incorrect).

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/namei.c            | 39 +++++++++++++++++++++++++++++++++++++++
 include/linux/namei.h |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 7af9b464886a..df21b6fa5a0e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1874,6 +1874,45 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap,
 }
 EXPORT_SYMBOL(dentry_lookup_killable);
 
+/**
+ * dentry_lookup_continue: lock a dentry if it is still in the given parent, prior to dir ops
+ * @child: the dentry to lock
+ * @parent: the dentry of the assumed parent
+ *
+ * The child is locked - currently by taking i_rwsem on the parent - to
+ * prepare for create/remove operations.  If the given parent is not
+ * %NULL and is no longer the parent of the dentry after the lock is
+ * gained, the lock is released and the call fails (returns
+ * ERR_PTR(-EINVAL).
+ *
+ * On success a reference to the child is taken and returned.  The lock
+ * and reference must both be dropped by done_dentry_lookup() after the
+ * operation completes.
+ */
+struct dentry *dentry_lookup_continue(struct dentry *child,
+				      struct dentry *parent)
+{
+	struct dentry *p = parent;
+
+again:
+	if (!parent)
+		p = dget_parent(child);
+	inode_lock_nested(d_inode(p), I_MUTEX_PARENT);
+	if (child->d_parent != p) {
+		inode_unlock(d_inode(p));
+		if (!parent) {
+			dput(p);
+			goto again;
+		}
+		return ERR_PTR(-EINVAL);
+	}
+	if (!parent)
+		dput(p);
+	/* get the child to balance with done_dentry_lookup() which puts it. */
+	return dget(child);
+}
+EXPORT_SYMBOL(dentry_lookup_continue);
+
 /**
  * done_dentry_lookup - finish a lookup used for create/delete
  * @dentry:  the target dentry
diff --git a/include/linux/namei.h b/include/linux/namei.h
index facb5852afa9..67eef91603cc 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -88,6 +88,8 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap,
 				      unsigned int lookup_flags);
 struct dentry *dentry_lookup_noperm(struct qstr *name, struct dentry *base,
 				      unsigned int lookup_flags);
+struct dentry *dentry_lookup_continue(struct dentry *child,
+				      struct dentry *parent);
 void done_dentry_lookup(struct dentry *dentry);
 /* no_free_ptr() must not be used here - use dget() */
 DEFINE_FREE(dentry_lookup, struct dentry *, if (_T) done_dentry_lookup(_T))
-- 
2.50.0.107.gf914562f5916.dirty
Re: [PATCH 04/11] VFS: introduce dentry_lookup_continue()
Posted by Amir Goldstein 1 month, 2 weeks ago
On Wed, Aug 13, 2025 at 1:53 AM NeilBrown <neil@brown.name> wrote:
>
> A few callers operate on a dentry which they already have - unlike the
> normal case where a lookup proceeds an operation.
>
> For these callers dentry_lookup_continue() is provided where other
> callers would use dentry_lookup().  The call will fail if, after the
> lock was gained, the child is no longer a child of the given parent.
>
> There are a couple of callers that want to lock a dentry in whatever
> its current parent is.  For these a NULL parent can be passed, in which
> case ->d_parent is used.  In this case the call cannot fail.
>
> The idea behind the name is that the actual lookup occurred some time
> ago, and now we are continuing with an operation on the dentry.
>
> When the operation completes done_dentry_lookup() must be called.  An
> extra reference is taken when the dentry_lookup_continue() call succeeds
> and will be dropped by done_dentry_lookup().
>
> This will be used in smb/server, ecryptfs, and overlayfs, each of which
> have their own lock_parent() or parent_lock() or similar; and a few
> other places which lock the parent but don't check if the parent is
> still correct (often because rename isn't supported so parent cannot be
> incorrect).
>
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  fs/namei.c            | 39 +++++++++++++++++++++++++++++++++++++++
>  include/linux/namei.h |  2 ++
>  2 files changed, 41 insertions(+)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 7af9b464886a..df21b6fa5a0e 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1874,6 +1874,45 @@ struct dentry *dentry_lookup_killable(struct mnt_idmap *idmap,
>  }
>  EXPORT_SYMBOL(dentry_lookup_killable);
>
> +/**
> + * dentry_lookup_continue: lock a dentry if it is still in the given parent, prior to dir ops
> + * @child: the dentry to lock
> + * @parent: the dentry of the assumed parent
> + *
> + * The child is locked - currently by taking i_rwsem on the parent - to
> + * prepare for create/remove operations.  If the given parent is not
> + * %NULL and is no longer the parent of the dentry after the lock is
> + * gained, the lock is released and the call fails (returns
> + * ERR_PTR(-EINVAL).
> + *
> + * On success a reference to the child is taken and returned.  The lock
> + * and reference must both be dropped by done_dentry_lookup() after the
> + * operation completes.
> + */
> +struct dentry *dentry_lookup_continue(struct dentry *child,
> +                                     struct dentry *parent)
> +{
> +       struct dentry *p = parent;
> +
> +again:
> +       if (!parent)
> +               p = dget_parent(child);
> +       inode_lock_nested(d_inode(p), I_MUTEX_PARENT);
> +       if (child->d_parent != p) {

|| d_unhashed(child))

;)

and what about silly renames? are those also d_unhashed()?

Thanks,
Amir.
Re: [PATCH 04/11] VFS: introduce dentry_lookup_continue()
Posted by Al Viro 1 month, 3 weeks ago
On Tue, Aug 12, 2025 at 12:25:07PM +1000, NeilBrown wrote:
> A few callers operate on a dentry which they already have - unlike the
> normal case where a lookup proceeds an operation.
> 
> For these callers dentry_lookup_continue() is provided where other
> callers would use dentry_lookup().  The call will fail if, after the
> lock was gained, the child is no longer a child of the given parent.
> 
> There are a couple of callers that want to lock a dentry in whatever
> its current parent is.  For these a NULL parent can be passed, in which
> case ->d_parent is used.  In this case the call cannot fail.
> 
> The idea behind the name is that the actual lookup occurred some time
> ago, and now we are continuing with an operation on the dentry.
> 
> When the operation completes done_dentry_lookup() must be called.  An
> extra reference is taken when the dentry_lookup_continue() call succeeds
> and will be dropped by done_dentry_lookup().
> 
> This will be used in smb/server, ecryptfs, and overlayfs, each of which
> have their own lock_parent() or parent_lock() or similar; and a few
> other places which lock the parent but don't check if the parent is
> still correct (often because rename isn't supported so parent cannot be
> incorrect).

I would really like the see the conversion of these callers.  You are
asking for a buy-in for a primitive with specific semantics; that's
hard to review without seeing how it will be used.
Re: [PATCH 04/11] VFS: introduce dentry_lookup_continue()
Posted by NeilBrown 1 month, 3 weeks ago
On Wed, 13 Aug 2025, Al Viro wrote:
> On Tue, Aug 12, 2025 at 12:25:07PM +1000, NeilBrown wrote:
> > A few callers operate on a dentry which they already have - unlike the
> > normal case where a lookup proceeds an operation.
> > 
> > For these callers dentry_lookup_continue() is provided where other
> > callers would use dentry_lookup().  The call will fail if, after the
> > lock was gained, the child is no longer a child of the given parent.
> > 
> > There are a couple of callers that want to lock a dentry in whatever
> > its current parent is.  For these a NULL parent can be passed, in which
> > case ->d_parent is used.  In this case the call cannot fail.
> > 
> > The idea behind the name is that the actual lookup occurred some time
> > ago, and now we are continuing with an operation on the dentry.
> > 
> > When the operation completes done_dentry_lookup() must be called.  An
> > extra reference is taken when the dentry_lookup_continue() call succeeds
> > and will be dropped by done_dentry_lookup().
> > 
> > This will be used in smb/server, ecryptfs, and overlayfs, each of which
> > have their own lock_parent() or parent_lock() or similar; and a few
> > other places which lock the parent but don't check if the parent is
> > still correct (often because rename isn't supported so parent cannot be
> > incorrect).
> 
> I would really like the see the conversion of these callers.  You are
> asking for a buy-in for a primitive with specific semantics; that's
> hard to review without seeing how it will be used.
> 

All, or just some?
I use dentry_lookup_continue() in:
  cachefiles: 4 times
  ecryptfs: once
  overlayfs: twice
  smb/server: once
  apparmor: once

Maybe I could include all in the one patch...

NeilBrown