From: NeilBrown <neil@brown.name>
The recent conversion of fuse_reverse_inval_entry() to use
start_removing_noperm() was wrong.
As Val Packett points out, the original code did not call ->lookup
while the new code does. This can lead to a deadlock.
Rather than using full_name_hash() and d_lookup() as the old code
did, we can use try_lookup_noperm() which combines these. Then
the result can be given to start_removing_dentry() to get the required
locks for removal. We then double check that the name hasn't
changed.
As 'dir' needs to be used several times now, we delay the dput() until
the end, and initialise it to NULL so dput() is always safe.
Reported-by: Val Packett <val@packett.cool>
Closes: https://lore.kernel.org/all/6713ea38-b583-4c86-b74a-bea55652851d@packett.cool
Fixes: c9ba789dad15 ("VFS: introduce start_creating_noperm() and start_removing_noperm()")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/fuse/dir.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index a0d5b302bcc2..8384fa96cf53 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1390,8 +1390,8 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 {
 	int err = -ENOTDIR;
 	struct inode *parent;
-	struct dentry *dir;
-	struct dentry *entry;
+	struct dentry *dir = NULL;
+	struct dentry *entry = NULL;
 
 	parent = fuse_ilookup(fc, parent_nodeid, NULL);
 	if (!parent)
@@ -1404,11 +1404,19 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 	dir = d_find_alias(parent);
 	if (!dir)
 		goto put_parent;
-
-	entry = start_removing_noperm(dir, name);
-	dput(dir);
-	if (IS_ERR(entry))
-		goto put_parent;
+	while (!entry) {
+		struct dentry *child = try_lookup_noperm(name, dir);
+		if (!child || IS_ERR(child))
+			goto put_parent;
+		entry = start_removing_dentry(dir, child);
+		dput(child);
+		if (IS_ERR(entry))
+			goto put_parent;
+		if (!d_same_name(entry, dir, name)) {
+			end_removing(entry);
+			entry = NULL;
+		}
+	}
 
 	fuse_dir_changed(parent);
 	if (!(flags & FUSE_EXPIRE_ONLY))
@@ -1446,6 +1454,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 
 	end_removing(entry);
 put_parent:
+	dput(dir);
 	iput(parent);
 	return err;
 }
--
2.50.0.107.gf914562f5916.dirty
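
The loop in the patch follows a "look up without the lock, take the lock, re-check the name, retry" pattern. The following is a minimal userspace sketch of that pattern only. It is not kernel code: toy_entry, toy_dir, toy_lookup(), toy_lock_for_removal(), toy_rename_first() and the race hook are all hypothetical names invented for illustration, and the int flag merely stands in for the parent lock.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; none of these are kernel types or kernel APIs. */
struct toy_entry {
	char name[32];
};

struct toy_dir {
	int locked;                      /* models the parent lock */
	struct toy_entry *entries;
	size_t nr;
	void (*race)(struct toy_dir *);  /* hypothetical hook: runs once, before locking */
};

/* Cache-only lookup: never creates an entry, never takes the lock. */
static struct toy_entry *toy_lookup(struct toy_dir *dir, const char *name)
{
	for (size_t i = 0; i < dir->nr; i++)
		if (strcmp(dir->entries[i].name, name) == 0)
			return &dir->entries[i];
	return NULL;
}

/* Returns the entry with dir locked, or NULL with dir unlocked. */
static struct toy_entry *toy_lock_for_removal(struct toy_dir *dir,
					      const char *name)
{
	for (;;) {
		struct toy_entry *child = toy_lookup(dir, name);

		if (!child)
			return NULL;
		if (dir->race) {
			/* Simulate a concurrent change in the unlocked window. */
			void (*race)(struct toy_dir *) = dir->race;

			dir->race = NULL;
			race(dir);
		}
		dir->locked = 1;
		if (strcmp(child->name, name) == 0)
			return child;     /* name still matches: keep the lock */
		dir->locked = 0;          /* renamed under us: unlock and retry */
	}
}

/* Example racer: renames the first entry, as a concurrent rename might. */
static void toy_rename_first(struct toy_dir *dir)
{
	strcpy(dir->entries[0].name, "renamed");
}
```

In this model a lookup miss ends the operation immediately, which mirrors the patch bailing out when try_lookup_noperm() finds nothing cached rather than asking the filesystem to look the name up.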
On Sun, Nov 30, 2025 at 11:06 PM NeilBrown <neilb@ownmail.net> wrote:
>
>
> From: NeilBrown <neil@brown.name>
>
> The recent conversion of fuse_reverse_inval_entry() to use
> start_removing() was wrong.
> As Val Packett points out the original code did not call ->lookup
> while the new code does. This can lead to a deadlock.
>
> Rather than using full_name_hash() and d_lookup() as the old code
> did, we can use try_lookup_noperm() which combines these. Then
> the result can be given to start_removing_dentry() to get the required
> locks for removal. We then double check that the name hasn't
> changed.
>
> As 'dir' needs to be used several times now, we load the dput() until
> the end, and initialise to NULL so dput() is always safe.
>
> Reported-by: Val Packett <val@packett.cool>
> Closes: https://lore.kernel.org/all/6713ea38-b583-4c86-b74a-bea55652851d@packett.cool
> Fixes: c9ba789dad15 ("VFS: introduce start_creating_noperm() and start_removing_noperm()")
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
> fs/fuse/dir.c | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index a0d5b302bcc2..8384fa96cf53 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -1390,8 +1390,8 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
>  {
>  	int err = -ENOTDIR;
>  	struct inode *parent;
> -	struct dentry *dir;
> -	struct dentry *entry;
> +	struct dentry *dir = NULL;
> +	struct dentry *entry = NULL;
>
>  	parent = fuse_ilookup(fc, parent_nodeid, NULL);
>  	if (!parent)
> @@ -1404,11 +1404,19 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
>  	dir = d_find_alias(parent);
>  	if (!dir)
>  		goto put_parent;
> -
> -	entry = start_removing_noperm(dir, name);
> -	dput(dir);
> -	if (IS_ERR(entry))
> -		goto put_parent;
> +	while (!entry) {
> +		struct dentry *child = try_lookup_noperm(name, dir);
> +		if (!child || IS_ERR(child))
> +			goto put_parent;
> +		entry = start_removing_dentry(dir, child);
> +		dput(child);
> +		if (IS_ERR(entry))
> +			goto put_parent;
> +		if (!d_same_name(entry, dir, name)) {
> +			end_removing(entry);
> +			entry = NULL;
> +		}
> +	}
Can you explain why it is so important to use
start_removing_dentry() around shrink_dcache_parent()?
Is there a problem with reverting the change in this function
instead of accommodating start_removing_dentry()?
I don't think there is a point in optimizing parallel dir operations
with FUSE server cache invalidation, but maybe I am missing
something.
Thanks,
Amir.
On Mon, Dec 01, 2025 at 09:22:54AM +0100, Amir Goldstein wrote:
> I don't think there is a point in optimizing parallel dir operations
> with FUSE server cache invalidation, but maybe I am missing
> something.
The interesting part is the expected semantics of operation;
d_invalidate() side definitely doesn't need any of that cruft,
but I would really like to understand what that function
is supposed to do.
Miklos, could you post a brain dump on that?
On Mon, 1 Dec 2025 at 09:33, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Dec 01, 2025 at 09:22:54AM +0100, Amir Goldstein wrote:
>
> > I don't think there is a point in optimizing parallel dir operations
> > with FUSE server cache invalidation, but maybe I am missing
> > something.
>
> The interesting part is the expected semantics of operation;
> d_invalidate() side definitely doesn't need any of that cruft,
> but I would really like to understand what that function
> is supposed to do.
>
> Miklos, could you post a brain dump on that?
This function is supposed to invalidate a dentry due to remote changes
(FUSE_NOTIFY_INVAL_ENTRY). Originally it was supplied a parent ID and
a name and called d_invalidate() on the looked up dentry.
Then it grew a variant (FUSE_NOTIFY_DELETE) that was also supplied a
child ID, which was matched against the looked up inode. This was
commit 451d0f599934 ("FUSE: Notifying the kernel of deletion.").
Apparently this worked around the fact that at that time
d_invalidate() returned -EBUSY if the target was still in use and
didn't unhash the dentry in that case.
That was later changed by commit bafc9b754f75 ("vfs: More precise
tests in d_invalidate") to unconditionally unhash the target, which
effectively made FUSE_NOTIFY_INVAL_ENTRY and FUSE_NOTIFY_DELETE
equivalent and the code in question unnecessary.
For the future, we could also introduce FUSE_NOTIFY_MOVE, that would
differentiate between a delete and a move, while
FUSE_NOTIFY_INVAL_ENTRY would continue to be the common (deleted or
moved) notification.
Attaching untested patch to remove this cruft.
Thanks,
Miklos
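
The change in d_invalidate() behaviour described above can be pictured with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: toy_dentry, toy_invalidate_old() and toy_invalidate_new() are hypothetical names, and "users > 0" is only a stand-in for "the target was still in use".

```c
#include <errno.h>

/* Toy stand-in for a cached directory entry; not a kernel type. */
struct toy_dentry {
	int users;   /* models "still in use" by someone else */
	int hashed;  /* 1 while the entry can be found by name lookup */
};

/* Older behaviour as described above: refuse when busy, leave it hashed. */
static int toy_invalidate_old(struct toy_dentry *d)
{
	if (d->users > 0)
		return -EBUSY;
	d->hashed = 0;
	return 0;
}

/* Later behaviour as described above: unhash unconditionally. */
static void toy_invalidate_new(struct toy_dentry *d)
{
	d->hashed = 0;
}
```

With the unconditional variant a busy entry can no longer stay findable by name after a notification, which is the property that, per the explanation above, made the two notifications effectively equivalent.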
On Mon, Dec 01, 2025 at 03:03:08PM +0100, Miklos Szeredi wrote:
> On Mon, 1 Dec 2025 at 09:33, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Mon, Dec 01, 2025 at 09:22:54AM +0100, Amir Goldstein wrote:
> >
> > > I don't think there is a point in optimizing parallel dir operations
> > > with FUSE server cache invalidation, but maybe I am missing
> > > something.
> >
> > The interesting part is the expected semantics of operation;
> > d_invalidate() side definitely doesn't need any of that cruft,
> > but I would really like to understand what that function
> > is supposed to do.
> >
> > Miklos, could you post a brain dump on that?
>
> This function is supposed to invalidate a dentry due to remote changes
> (FUSE_NOTIFY_INVAL_ENTRY). Originally it was supplied a parent ID and
> a name and called d_invalidate() on the looked up dentry.
>
> Then it grew a variant (FUSE_NOTIFY_DELETE) that was also supplied a
> child ID, which was matched against the looked up inode. This was
> commit 451d0f599934 ("FUSE: Notifying the kernel of deletion."),
> Apparently this worked around the fact that at that time
> d_invalidate() returned -EBUSY if the target was still in use and
> didn't unhash the dentry in that case.
>
> That was later changed by commit bafc9b754f75 ("vfs: More precise
> tests in d_invalidate") to unconditionally unhash the target, which
> effectively made FUSE_NOTIFY_INVAL_ENTRY and FUSE_NOTIFY_DELETE
> equivalent and the code in question unnecessary.
>
> For the future, we could also introduce FUSE_NOTIFY_MOVE, that would
> differentiate between a delete and a move, while
> FUSE_NOTIFY_INVAL_ENTRY would continue to be the common (deleted or
> moved) notification.
Then as far as VFS is concerned, it's an equivalent of "we'd done
a dcache lookup and revalidate told us to bugger off", which does
*not* need locking the parent - the same sequence can very well
happen without touching any inode locks.
IOW, from the point of view of locking protocol changes that's not
a removal at all.
Or do you need them serialized for fuse-internal purposes?