When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:
WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
cachefiles_cull+0x8c/0xe0 [cachefiles]
cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
vfs_write+0xc3/0x480
...
reports.
Fix this by removing the dput().
Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index bdac2f33edf3..e2023e78e4df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -795,7 +795,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 
 	ret = cachefiles_bury_object(cache, NULL, dir, victim,
 				     FSCACHE_OBJECT_WAS_CULLED);
-	dput(victim);
 	if (ret < 0)
 		goto error;
 
David Howells <dhowells@redhat.com> wrote:
> Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
Actually, this should probably be:
Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
On Tue, Mar 24, 2026 at 7:50 PM David Howells <dhowells@redhat.com> wrote:
>
> David Howells <dhowells@redhat.com> wrote:
>
> > Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()")
>
> Actually, this should probably be:
>
> Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
I think it is the correct Fixes tag, but I'm not sure that this is
actually the right fix. 7bb1eb45e43c switched other callers of
cachefiles_bury_object to use start_removing_dentry, which gets an
additional ref, and removed the extra dget from
cachefiles_bury_object. In the cachefiles_cull case however, the
dentry is from start_removing and has a single ref on entry to
cachefiles_bury_object, which is an issue as "rep" may be used there
after end_removing may have put the last ref. So the correct fix is
probably for cachefiles_cull to add a dget() before the call to
cachefiles_bury_object.
Marc
Marc Dionne <marc.c.dionne@gmail.com> wrote:

> I think it is the correct Fixes tag, but I'm not sure that this is
> actually the right fix. 7bb1eb45e43c switched other callers of
> cachefiles_bury_object to use start_removing_dentry, which gets an
> additional ref, and removed the extra dget from
> cachefiles_bury_object. In the cachefiles_cull case however, the
> dentry is from start_removing and has a single ref on entry to
> cachefiles_bury_object, which is an issue as "rep" may be used there
> after end_removing may have put the last ref. So the correct fix is
> probably for cachefiles_cull to add a dget() before the call to
> cachefiles_bury_object.

Ugh. You're right.

The problem is that we're calling start_removing() without knowing whether we
can just unlink the object. I wonder if I need to do the lookup in
cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
it's not a directory (directories get moved to the graveyard for cachefilesd
to tear down).

I think the right solution is actually to move start_removing_dentry() down
into cachefiles_bury_object() and make it contingent on the dentry being a
non-dir.

David
On Thu, 26 Mar 2026, David Howells wrote:
> Marc Dionne <marc.c.dionne@gmail.com> wrote:
>
> > I think it is the correct Fixes tag, but I'm not sure that this is
> > actually the right fix. 7bb1eb45e43c switched other callers of
> > cachefiles_bury_object to use start_removing_dentry, which gets an
> > additional ref, and removed the extra dget from
> > cachefiles_bury_object. In the cachefiles_cull case however, the
> > dentry is from start_removing and has a single ref on entry to
> > cachefiles_bury_object, which is an issue as "rep" may be used there
> > after end_removing may have put the last ref. So the correct fix is
> > probably for cachefiles_cull to add a dget() before the call to
> > cachefiles_bury_object.
>
> Ugh. You're right.
>
> The problem is that we're calling start_removing() without knowing whether we
> can just unlink the object. I wonder if I need to do the lookup in
> cachefiles_lookup_for_cull() and only then call start_removing_dentry() if
> it's not a directory (directories get moved to the graveyard for cachefilesd
> to tear down).
>
> I think the right solution is actually to move start_removing_dentry() down
> into cachefiles_bury_object() and make it contingent on the dentry being a
> non-dir.
>
> David
>

cachefiles_bury_object() has a comment saying:

 * On entry there must be at least 2 refs on rep, one will be dropped on exit.

and this is consistent with the code in that function.

It is called from 4 places:

 - cachefiles_invalidate_cookie(), cachefiles_look_up_object(), and
   cachefiles_acquire_volume() all precede it with a
   start_removing_dentry(), which results in 2 references to the dentry (the
   original and an extra one which it takes), so that fits with the comment.

 - cachefiles_cull() precedes it with cachefiles_lookup_for_cull(), which
   uses start_removing(), which returns with 1 reference to the dentry. As
   the dentry didn't pre-exist, there is only one ref. So this is incorrect.

cachefiles_cull() needs to take an extra reference to victim so that when
cachefiles_bury_object() calls end_removing(), it still has a valid reference.

So I think

--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -781,7 +781,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	if (ret < 0)
 		goto error_unlock;
 
-	ret = cachefiles_bury_object(cache, NULL, dir, victim,
+	ret = cachefiles_bury_object(cache, NULL, dir, dget(victim),
 				     FSCACHE_OBJECT_WAS_CULLED);
 	dput(victim);
 	if (ret < 0)

would be a correct fix.

If you agree I can post a properly formatted patch with explanation.

Thanks,
NeilBrown
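
The two-refs-on-entry contract discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_dentry`, `toy_dget()`, `toy_dput()`, `toy_bury_object()` and `toy_cull()` are hypothetical stand-ins rather than kernel APIs, and only the reference counting is modelled (no locking, no real freeing).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry: only the refcount is modelled. */
struct toy_dentry {
	int refcount;
	bool freed;	/* set when the last ref is dropped */
};

static void toy_dget(struct toy_dentry *d)
{
	assert(!d->freed);
	d->refcount++;
}

static void toy_dput(struct toy_dentry *d)
{
	assert(!d->freed && d->refcount > 0);
	if (--d->refcount == 0)
		d->freed = true;	/* a real dput() might free the dentry here */
}

/*
 * Models a callee that expects at least 2 refs on entry and drops one.
 * Returns -1 if the dentry went away underneath it.
 */
static int toy_bury_object(struct toy_dentry *rep)
{
	toy_dput(rep);		/* stands in for end_removing(rep) */
	if (rep->freed)
		return -1;	/* real code would now be touching freed memory */
	return 0;
}

/* Models the cull path: the lookup hands back a dentry with a single ref. */
int toy_cull(bool take_extra_ref)
{
	struct toy_dentry victim = { .refcount = 1, .freed = false };
	int ret;

	if (take_extra_ref)
		toy_dget(&victim);	/* the extra ref proposed above */
	ret = toy_bury_object(&victim);
	if (ret == 0)
		toy_dput(&victim);	/* caller drops its own ref afterwards */
	return ret;
}
```

In this model, `toy_cull(true)` completes with the count balanced, while `toy_cull(false)` reports that the callee lost its only reference partway through.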
NeilBrown <neilb@ownmail.net> wrote:

> -	ret = cachefiles_bury_object(cache, NULL, dir, victim,
> +	ret = cachefiles_bury_object(cache, NULL, dir, dget(victim),

I would prefer the dget() to be on a line on its own before this one to make
it easier to spot.

> If you agree I can post a properly formatted patch with explanation.

That would be great!

Thanks,
David
When cachefiles_cull() calls cachefiles_bury_object(), the latter eats the
former's ref on the victim dentry that it obtained from
cachefiles_lookup_for_cull(). However, commit 7bb1eb45e43c left the dput
of the victim in place, resulting in occasional:
WARNING: fs/dcache.c:829 at dput.part.0+0xf5/0x110, CPU#7: cachefilesd/11831
cachefiles_cull+0x8c/0xe0 [cachefiles]
cachefiles_daemon_cull+0xcd/0x120 [cachefiles]
cachefiles_daemon_write+0x14e/0x1d0 [cachefiles]
vfs_write+0xc3/0x480
...
reports.
Actually, it's worse than that: cachefiles_bury_object() eats the ref it
was given - and then may continue to access the now-unref'd dentry if it
turns out to be a directory. So simply removing the aberrant dput() is not
sufficient.
Fix this by making cachefiles_bury_object() retain the ref itself around
end_removing() if it needs to keep it and then drop the ref before returning.
Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: NeilBrown <neil@brown.name>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 36 +++++++++++++++++++++---------------
1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index e5ec90dccc27..20138309733f 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -287,14 +287,14 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	if (!d_is_dir(rep)) {
 		ret = cachefiles_unlink(cache, object, dir, rep, why);
 		end_removing(rep);
-
 		_leave(" = %d", ret);
 		return ret;
 	}
 
 	/* directories have to be moved to the graveyard */
 	_debug("move stale object to graveyard");
-	end_removing(rep);
+	dget(rep);
+	end_removing(rep); /* Drops ref on rep */
 
 try_again:
 	/* first step is to make up a grave dentry in the graveyard */
@@ -304,8 +304,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 
 	/* do the multiway lock magic */
 	trap = lock_rename(cache->graveyard, dir);
-	if (IS_ERR(trap))
-		return PTR_ERR(trap);
+	if (IS_ERR(trap)) {
+		ret = PTR_ERR(trap);
+		goto out;
+	}
 
 	/* do some checks before getting the grave dentry */
 	if (rep->d_parent != dir || IS_DEADDIR(d_inode(rep))) {
@@ -313,25 +315,27 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		 * lock */
 		unlock_rename(cache->graveyard, dir);
 		_leave(" = 0 [culled?]");
-		return 0;
+		ret = 0;
+		goto out;
 	}
 
+	ret = -EIO;
 	if (!d_can_lookup(cache->graveyard)) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "Graveyard no longer a directory");
-		return -EIO;
+		goto out;
 	}
 
 	if (trap == rep) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "May not make directory loop");
-		return -EIO;
+		goto out;
 	}
 
 	if (d_mountpoint(rep)) {
 		unlock_rename(cache->graveyard, dir);
 		cachefiles_io_error(cache, "Mountpoint in cache");
-		return -EIO;
+		goto out;
 	}
 
 	grave = lookup_one(&nop_mnt_idmap, &QSTR(nbuffer), cache->graveyard);
@@ -343,11 +347,12 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 
 		if (PTR_ERR(grave) == -ENOMEM) {
 			_leave(" = -ENOMEM");
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto out;
 		}
 
 		cachefiles_io_error(cache, "Lookup error %ld", PTR_ERR(grave));
-		return -EIO;
+		goto out;
 	}
 
 	if (d_is_positive(grave)) {
@@ -362,7 +367,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		unlock_rename(cache->graveyard, dir);
 		dput(grave);
 		cachefiles_io_error(cache, "Mountpoint in graveyard");
-		return -EIO;
+		goto out;
 	}
 
 	/* target should not be an ancestor of source */
@@ -370,7 +375,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		unlock_rename(cache->graveyard, dir);
 		dput(grave);
 		cachefiles_io_error(cache, "May not make directory loop");
-		return -EIO;
+		goto out;
 	}
 
 	/* attempt the rename */
@@ -404,8 +409,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	__cachefiles_unmark_inode_in_use(object, d_inode(rep));
 	unlock_rename(cache->graveyard, dir);
 	dput(grave);
-	_leave(" = 0");
-	return 0;
+	_leave(" = %d", ret);
+out:
+	dput(rep);
+	return ret;
 }
 
 /*
@@ -812,7 +819,6 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 
 	ret = cachefiles_bury_object(cache, NULL, dir, victim,
 				     FSCACHE_OBJECT_WAS_CULLED);
-	dput(victim);
 	if (ret < 0)
 		goto error;
 
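
The ref handling in the patch above (take a ref before the call that drops one, then release it on a single exit path) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel code: `model_dentry`, `model_get()`, `model_put()`, `model_end_removing()`, `model_bury()` and `model_refs_after_bury()` are hypothetical names, and the `fail_midway` flag merely stands in for any of the error branches that now jump to `out`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry. */
struct model_dentry {
	int refcount;
	bool is_dir;
};

static void model_get(struct model_dentry *d)
{
	d->refcount++;
}

static void model_put(struct model_dentry *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Stands in for end_removing(): consumes the ref the caller passed in. */
static void model_end_removing(struct model_dentry *d)
{
	model_put(d);
}

/* Always consumes exactly one ref, whichever path is taken. */
static int model_bury(struct model_dentry *rep, bool fail_midway)
{
	int ret;

	if (!rep->is_dir) {
		/* Non-directories are finished with once the ref is dropped. */
		model_end_removing(rep);
		return 0;
	}

	model_get(rep);			/* pin rep across end_removing() */
	model_end_removing(rep);	/* drops the caller's ref */
	assert(rep->refcount > 0);	/* still safe to use rep here */

	if (fail_midway) {
		ret = -EIO;
		goto out;
	}
	ret = 0;
out:
	model_put(rep);			/* single place that releases the pin */
	return ret;
}

/* Helper: start with one ref, bury, and report how many refs remain. */
int model_refs_after_bury(bool is_dir, bool fail_midway)
{
	struct model_dentry d = { .refcount = 1, .is_dir = is_dir };

	(void)model_bury(&d, fail_midway);
	return d.refcount;
}
```

In this model the count ends at zero on every path, so the caller must not put the reference again after the call returns.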