[PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED

Mateusz Guzik posted 1 patch 1 month, 3 weeks ago
There is a newer version of this series
fs/namei.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
[PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Mateusz Guzik 1 month, 3 weeks ago
Otherwise the slowpath can be taken by the caller, defeating the flag.

This regressed after calls to legitimize_links() started being
conditionally elided and stems from the routine always failing
after seeing the flag, regardless if there were any links.

In order to address both the bug and the weird semantics make it illegal
to call legitimize_links() with LOOKUP_CACHED and handle the problem at
the two callsites.

Pull up ->depth = 0 to drop_links() to avoid repeating it in the
callers.

One remaining weirdness is terminate_walk() walking the symlink stack
after drop_links().

Fixes: 7c179096e77eca21 ("fs: add predicts based on nd->depth")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---

v2:
- handle terminate_walk looking at nd->depth after drop_links

 fs/namei.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..69d0aa9ad2a8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -774,6 +774,7 @@ static void drop_links(struct nameidata *nd)
 		do_delayed_call(&last->done);
 		clear_delayed_call(&last->done);
 	}
+	nd->depth = 0;
 }
 
 static void leave_rcu(struct nameidata *nd)
@@ -785,12 +786,13 @@ static void leave_rcu(struct nameidata *nd)
 
 static void terminate_walk(struct nameidata *nd)
 {
-	if (unlikely(nd->depth))
+	int depth = nd->depth;
+	if (unlikely(depth))
 		drop_links(nd);
 	if (!(nd->flags & LOOKUP_RCU)) {
 		int i;
 		path_put(&nd->path);
-		for (i = 0; i < nd->depth; i++)
+		for (i = 0; i < depth; i++)
 			path_put(&nd->stack[i].link);
 		if (nd->state & ND_ROOT_GRABBED) {
 			path_put(&nd->root);
@@ -799,7 +801,7 @@ static void terminate_walk(struct nameidata *nd)
 	} else {
 		leave_rcu(nd);
 	}
-	nd->depth = 0;
+	VFS_BUG_ON(nd->depth);
 	nd->path.mnt = NULL;
 	nd->path.dentry = NULL;
 }
@@ -830,11 +832,9 @@ static inline bool legitimize_path(struct nameidata *nd,
 static bool legitimize_links(struct nameidata *nd)
 {
 	int i;
-	if (unlikely(nd->flags & LOOKUP_CACHED)) {
-		drop_links(nd);
-		nd->depth = 0;
-		return false;
-	}
+
+	VFS_BUG_ON(nd->flags & LOOKUP_CACHED);
+
 	for (i = 0; i < nd->depth; i++) {
 		struct saved *last = nd->stack + i;
 		if (unlikely(!legitimize_path(nd, &last->link, last->seq))) {
@@ -883,6 +883,10 @@ static bool try_to_unlazy(struct nameidata *nd)
 
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
+	if (unlikely(nd->flags & LOOKUP_CACHED)) {
+		drop_links(nd);
+		goto out1;
+	}
 	if (unlikely(nd->depth && !legitimize_links(nd)))
 		goto out1;
 	if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
@@ -918,6 +922,10 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry)
 	int res;
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
+	if (unlikely(nd->flags & LOOKUP_CACHED)) {
+		drop_links(nd);
+		goto out2;
+	}
 	if (unlikely(nd->depth && !legitimize_links(nd)))
 		goto out2;
 	res = __legitimize_mnt(nd->path.mnt, nd->m_seq);
-- 
2.48.1
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Al Viro 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 09:47:04AM +0100, Mateusz Guzik wrote:
> Otherwise the slowpath can be taken by the caller, defeating the flag.
> 
> This regressed after calls to legitimize_links() started being
> conditionally elided and stems from the routine always failing
> after seeing the flag, regardless if there were any links.
> 
> In order to address both the bug and the weird semantics make it illegal
> to call legitimize_links() with LOOKUP_CACHED and handle the problem at
> the two callsites.
> 
> Pull up ->depth = 0 to drop_links() to avoid repeating it in the
> callers.
> 
> One remaining weirdness is terminate_walk() walking the symlink stack
> after drop_links().

What weirdness?  If we are not in RCU mode, we need to drop symlink bodies
*and* drop symlink references?
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Mateusz Guzik 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 10:07 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Wed, Dec 17, 2025 at 09:47:04AM +0100, Mateusz Guzik wrote:
> > One remaining weirdness is terminate_walk() walking the symlink stack
> > after drop_links().
>
> What weirdness?  If we are not in RCU mode, we need to drop symlink bodies
> *and* drop symlink references?

One would expect a routine named drop_links() would handle the
entirety of clean up of symlinks.

Seeing how it only handles some of it, it should be renamed to better
indicate what it is doing, but that's a potential clean up for later.
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Al Viro 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 10:11:04AM +0100, Mateusz Guzik wrote:
> On Wed, Dec 17, 2025 at 10:07 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Wed, Dec 17, 2025 at 09:47:04AM +0100, Mateusz Guzik wrote:
> > > One remaining weirdness is terminate_walk() walking the symlink stack
> > > after drop_links().
> >
> > What weirdness?  If we are not in RCU mode, we need to drop symlink bodies
> > *and* drop symlink references?
> 
> One would expect a routine named drop_links() would handle the
> entirety of clean up of symlinks.
> 
> Seeing how it only handles some of it, it should be renamed to better
> indicate what it is doing, but that's a potential clean up for later.

Take a look at the callers.  All 3 of them.

1) terminate_walk(): drop all symlink bodies, in non-RCU mode drop
all paths as well.

2) a couple in legitimize_links(): *always* called in RCU mode.  That's
the whole point - trying to grab references to a bunch of dentries/mounts,
so that we could continue in non-RCU mode from that point on.  What should
we do if we'd grabbed some of those references, but failed halfway through
the stack?

We *can't* do path_put() there - not under rcu_read_lock().  And we can't
delay dropping the link bodies past rcu_read_unlock().

Note that this state has
	nd->depth link bodies in stack, all need to be droped before
rcu_read_unlock()
	first K link references in stack that need to be dropped after
rcu_read_unlock()
	nd->depth - K link references in stack that do _not_ need to
be dropped.

Solution: have link bodies dropped, callbacks cleared and nd->depth
reset to K.  The caller of legitimate_links() immediately drops out
of RCU mode and we proceed to terminate_walk(), same as we would
on an error in non-RCU mode.

This case is on a slow path; we could microoptimize it, but result
would be really harder to understand.
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Mateusz Guzik 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 11:05 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Wed, Dec 17, 2025 at 10:11:04AM +0100, Mateusz Guzik wrote:
> > On Wed, Dec 17, 2025 at 10:07 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Wed, Dec 17, 2025 at 09:47:04AM +0100, Mateusz Guzik wrote:
> > > > One remaining weirdness is terminate_walk() walking the symlink stack
> > > > after drop_links().
> > >
> > > What weirdness?  If we are not in RCU mode, we need to drop symlink bodies
> > > *and* drop symlink references?
> >
> > One would expect a routine named drop_links() would handle the
> > entirety of clean up of symlinks.
> >
> > Seeing how it only handles some of it, it should be renamed to better
> > indicate what it is doing, but that's a potential clean up for later.
>
> Take a look at the callers.  All 3 of them.
>
> 1) terminate_walk(): drop all symlink bodies, in non-RCU mode drop
> all paths as well.
>
> 2) a couple in legitimize_links(): *always* called in RCU mode.  That's
> the whole point - trying to grab references to a bunch of dentries/mounts,
> so that we could continue in non-RCU mode from that point on.  What should
> we do if we'd grabbed some of those references, but failed halfway through
> the stack?
>
> We *can't* do path_put() there - not under rcu_read_lock().  And we can't
> delay dropping the link bodies past rcu_read_unlock().
>
> Note that this state has
>         nd->depth link bodies in stack, all need to be droped before
> rcu_read_unlock()
>         first K link references in stack that need to be dropped after
> rcu_read_unlock()
>         nd->depth - K link references in stack that do _not_ need to
> be dropped.
>
> Solution: have link bodies dropped, callbacks cleared and nd->depth
> reset to K.  The caller of legitimate_links() immediately drops out
> of RCU mode and we proceed to terminate_walk(), same as we would
> on an error in non-RCU mode.
>
> This case is on a slow path; we could microoptimize it, but result
> would be really harder to understand.

I'm not arguing for drop_links() to change behavior, but for it to be
renamed to something which indicates there is still potential
symlink-related clean up to do.

As an outsider, a routine named drop_${whatever} normally suggests the
${whatever} is fully taken care of after the call, which is not the
case here.
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Al Viro 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 11:13:22AM +0100, Mateusz Guzik wrote:

> I'm not arguing for drop_links() to change behavior, but for it to be
> renamed to something which indicates there is still potential
> symlink-related clean up to do.
> 
> As an outsider, a routine named drop_${whatever} normally suggests the
> ${whatever} is fully taken care of after the call, which is not the
> case here.

*shrug*

drop_link_bodies()?  Seriously, though - the real source of headache
had been a decision (mine, IIRC) to stash LOOKUP_CACHED logics into
legitimize_links() rather than having it done in both of its callers
(try_to_unlazy() and try_to_unlazy_next()).

It could be lifted into both callers instead, but we'd need to
	* remember that it should come *before* any references get
grabbed (and there would be a constant itch to move it down -
it would even work, and past legitimize_links() we wouldn't need
drop_links(), but it would make the thing potentially do seriously
blocking IO, contrary to LOOKUP_CACHED intent)
	* remember to keep drop_links()/resetting ->depth if done
before legitimize_links().
	* keep these two in sync.

If you have a cleaner solution for that, I'm all ears.  Really.
I'm not happy with details of that area (lazy to nonlazy transitions),
but so far I hadn't been able to make it more obviously correct.
And I'd looked at your original patches and missed that fun ;-/
Re: [PATCH v2] fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
Posted by Mateusz Guzik 1 month, 3 weeks ago
On Wed, Dec 17, 2025 at 11:13 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> On Wed, Dec 17, 2025 at 11:05 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Wed, Dec 17, 2025 at 10:11:04AM +0100, Mateusz Guzik wrote:
> > > On Wed, Dec 17, 2025 at 10:07 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > >
> > > > On Wed, Dec 17, 2025 at 09:47:04AM +0100, Mateusz Guzik wrote:
> > > > > One remaining weirdness is terminate_walk() walking the symlink stack
> > > > > after drop_links().
> > > >
> > > > What weirdness?  If we are not in RCU mode, we need to drop symlink bodies
> > > > *and* drop symlink references?
> > >
> > > One would expect a routine named drop_links() would handle the
> > > entirety of clean up of symlinks.
> > >
> > > Seeing how it only handles some of it, it should be renamed to better
> > > indicate what it is doing, but that's a potential clean up for later.
> >
> > Take a look at the callers.  All 3 of them.
> >
> > 1) terminate_walk(): drop all symlink bodies, in non-RCU mode drop
> > all paths as well.
> >
> > 2) a couple in legitimize_links(): *always* called in RCU mode.  That's
> > the whole point - trying to grab references to a bunch of dentries/mounts,
> > so that we could continue in non-RCU mode from that point on.  What should
> > we do if we'd grabbed some of those references, but failed halfway through
> > the stack?
> >
> > We *can't* do path_put() there - not under rcu_read_lock().  And we can't
> > delay dropping the link bodies past rcu_read_unlock().
> >
> > Note that this state has
> >         nd->depth link bodies in stack, all need to be droped before
> > rcu_read_unlock()
> >         first K link references in stack that need to be dropped after
> > rcu_read_unlock()
> >         nd->depth - K link references in stack that do _not_ need to
> > be dropped.
> >
> > Solution: have link bodies dropped, callbacks cleared and nd->depth
> > reset to K.  The caller of legitimate_links() immediately drops out
> > of RCU mode and we proceed to terminate_walk(), same as we would
> > on an error in non-RCU mode.
> >
> > This case is on a slow path; we could microoptimize it, but result
> > would be really harder to understand.
>
> I'm not arguing for drop_links() to change behavior, but for it to be
> renamed to something which indicates there is still potential
> symlink-related clean up to do.
>
> As an outsider, a routine named drop_${whatever} normally suggests the
> ${whatever} is fully taken care of after the call, which is not the
> case here.

Completely untested clean up for illustrative purposes:
static void links_issue_delayed_calls(struct nameidata *nd)
{
        int i = nd->depth;
        while (i--) {
                struct saved *last = nd->stack + i;
                do_delayed_call(&last->done);
                clear_delayed_call(&last->done);
        }
}

static void links_cleanup_rcu(struct nameidata *nd)
{
        VFS_BUG_ON(!(nd->flags & LOOKUP_RCU));

        if (likely(!nd->depth))
                return;

        links_issue_delayed_calls(nd);
        nd->depth = 0;
}

static void links_cleanup_ref(struct nameidata *nd)
{
        VFS_BUG_ON(nd->flags & LOOKUP_RCU);

        if (likely(!nd->depth))
                return;

        links_issue_delayed_calls(nd);

        path_put(&nd->path);
        for (int i = 0; i < nd->depth; i++)
                path_put(&nd->stack[i].link);
        if (nd->state & ND_ROOT_GRABBED) {
                path_put(&nd->root);
                nd->state &= ~ND_ROOT_GRABBED;
        }
        nd->depth = 0;
}

static void leave_rcu(struct nameidata *nd)
{
        nd->flags &= ~LOOKUP_RCU;
        nd->seq = nd->next_seq = 0;
        rcu_read_unlock();
}

static void terminate_walk(struct nameidata *nd)
{
        if (nd->flags & LOOKUP_RCU) {
                links_cleanup_rcu(nd);
                leave_rcu(nd);
        } else {
                links_cleanup_ref(nd);
        }
        VFS_BUG_ON(nd->depth);
        nd->path.mnt = NULL;
        nd->path.dentry = NULL;
}