[RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free

[RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Helge Deller 2 months, 2 weeks ago
The Debian buildd servers for the parisc architecture crash reproducibly when
building the webkit2gtk Debian package, shortly after showing the warning
below.

This patch keeps the dentry's d_lock held until just before the dentry is given
back to the cache, after any "external dentry name" has been freed.

I'm not sure if this patch is really correct, but it seems to have fixed the
problem, although more testing is needed.

 kernel: WARNING: CPU: 2 PID: 65 at fs/dcache.c:430 dentry_free+0x15c/0x188
 kernel: Modules linked in: binfmt_misc ipmi_si ipmi_devintf ipmi_msghandler sg configfs nfnetlink autofs4 ex>
 kernel: CPU: 2 UID: 0 PID: 65 Comm: kswapd0 Tainted: G        W  6.18.15+deb14-parisc64 #1 NONE  De>
 kernel: Hardware name: 9000/800/rp4440
 kernel:
 kernel:  IAOQ[0]: dentry_free+0x15c/0x188
 kernel:  IAOQ[1]: dentry_free+0x160/0x188
 kernel:  RP(r2): __dentry_kill+0x2a4/0x338
 kernel: Backtrace:
 kernel:  [<00000000408d9eb0>] __dentry_kill+0x2a4/0x338
 kernel:  [<00000000408dc270>] shrink_dentry_list+0xfc/0x1d0
 kernel:  [<00000000408dcae4>] prune_dcache_sb+0x88/0xc0
 kernel:  [<00000000408a6410>] super_cache_scan+0x2bc/0x440
 kernel:  [<0000000040742f38>] do_shrink_slab+0x254/0x610
 kernel:  [<00000000407449b4>] shrink_slab+0x4d8/0x860
 kernel:  [<000000004073948c>] shrink_one+0x108/0x468
 kernel:  [<000000004073f270>] shrink_node+0xfdc/0x17a0
 kernel:  [<000000004074016c>] balance_pgdat+0x738/0xfb0
 kernel:  [<0000000040740da4>] kswapd+0x3c0/0x788
 kernel:  [<00000000403b35cc>] kthread+0x230/0x430
 kernel:  [<000000004031d020>] ret_from_kernel_thread+0x20/0x28

Signed-off-by: Helge Deller <deller@gmx.de>


diff --git a/fs/dcache.c b/fs/dcache.c
index 7ba1801d8132..c1123787d3bd 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -343,6 +343,7 @@ static void __d_free(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
 
+	spin_unlock(&dentry->d_lock);
 	kmem_cache_free(dentry_cache, dentry); 
 }
 
@@ -350,6 +351,7 @@ static void __d_free_external(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
 	kfree(external_name(dentry));
+	spin_unlock(&dentry->d_lock);
 	kmem_cache_free(dentry_cache, dentry);
 }
 
@@ -684,9 +686,10 @@ static struct dentry *__dentry_kill(struct dentry *dentry)
 	dentry_unlist(dentry);
 	if (dentry->d_flags & DCACHE_SHRINK_LIST)
 		can_free = false;
-	spin_unlock(&dentry->d_lock);
 	if (likely(can_free))
 		dentry_free(dentry);
+	else
+		spin_unlock(&dentry->d_lock);
 	if (parent && --parent->d_lockref.count) {
 		spin_unlock(&parent->d_lock);
 		return NULL;
@@ -1165,9 +1168,10 @@ void shrink_dentry_list(struct list_head *list)
 			rcu_read_unlock();
 			d_shrink_del(dentry);
 			can_free = dentry->d_flags & DCACHE_DENTRY_KILLED;
-			spin_unlock(&dentry->d_lock);
 			if (can_free)
 				dentry_free(dentry);
+			else
+				spin_unlock(&dentry->d_lock);
 			continue;
 		}
 		d_shrink_del(dentry);
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Al Viro 2 months, 2 weeks ago
On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> The debian buildd servers for the parisc architecture crash reproducibly when
> building the webkit2gtk debian package, shortly after having shown the warning
> below.
> 
> This patch keeps the lock of the dentry up until when the dentry is given back
> to the cache and after having freed the "external dentry name".
> 
> I'm not sure if this patch is really correct, but it seems to have fixed the
> problem, although more testing is needed.

Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
for freeing (like, say it, any RCU pathwalk trying to check if the end result can
be grabbed) into a UAF.

Do you have a better localized reproducer?
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Al Viro 2 months, 2 weeks ago
On Mon, Apr 06, 2026 at 09:07:33PM +0100, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> > The debian buildd servers for the parisc architecture crash reproducibly when
> > building the webkit2gtk debian package, shortly after having shown the warning
> > below.
> > 
> > This patch keeps the lock of the dentry up until when the dentry is given back
> > to the cache and after having freed the "external dentry name".
> > 
> > I'm not sure if this patch is really correct, but it seems to have fixed the
> > problem, although more testing is needed.
> 
> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> be grabbed) into a UAF.
> 
> Do you have a better localized reproducer?

BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
that changes in there might accidentally fix that, and if they did it would narrow
the things down a lot.

Some invariants that ought to hold:
	1) dentry_free() should never be called without DCACHE_DENTRY_KILLED
	2) DCACHE_DENTRY_KILLED should never be set on positive dentries
	3) DCACHE_DENTRY_KILLED | DCACHE_PAR_LOOKUP is only possible for
dentries that had never been inserted into ->d_in_lookup_hash
	4) dentry with DCACHE_DENTRY_KILLED should never become positive

Could you turn that
	WARN_ON(!hlist_unhashed(&dentry->d_alias));
in whatever you'd been testing into
	if (WARN_ON(!hlist_unhashed(&dentry->d_alias)))
		printk(KERN_ERR "->d_inode = %p, ->d_flags = %x",
			dentry->d_inode, dentry->d_flags);
and see what it shows?  That's a test separate from the #work.dcache-busy-wait
one - please do it on the tree where you'd seen the original bug.
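
For illustration only: a minimal userspace sketch of invariants 1 and 2 above,
i.e. the state a dentry should be in when it reaches dentry_free(). This is not
code from the kernel or from this thread. The struct, the flag value and the
helper names are assumptions made up for the sketch; the real flag values are
defined in include/linux/dcache.h. Invariant 3 depends on the history of
->d_in_lookup_hash and is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit value for this sketch only; NOT the kernel's value. */
#define MODEL_DENTRY_KILLED 0x1u

/* Hypothetical stand-in for the two struct dentry fields the printk dumps. */
struct model_dentry {
	const void *d_inode;   /* NULL means a negative dentry */
	unsigned int d_flags;
};

/* Invariant 1: only killed dentries may reach dentry_free(). */
static bool inv_killed_before_free(const struct model_dentry *d)
{
	return (d->d_flags & MODEL_DENTRY_KILLED) != 0;
}

/* Invariant 2: a killed dentry is never positive. */
static bool inv_killed_is_negative(const struct model_dentry *d)
{
	return !(d->d_flags & MODEL_DENTRY_KILLED) || d->d_inode == NULL;
}

/* Both checks together: the state expected at the time of freeing. */
static bool ok_to_free(const struct model_dentry *d)
{
	return inv_killed_before_free(d) && inv_killed_is_negative(d);
}
```

Reading the requested printk output against these two checks is the same
exercise done by eye: a non-NULL ->d_inode together with the killed flag would
break invariant 2, and a missing killed flag would break invariant 1.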
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Helge Deller 2 months, 2 weeks ago
Hi Al,

On 4/6/26 22:28, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:07:33PM +0100, Al Viro wrote:
>> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
>>> The debian buildd servers for the parisc architecture crash reproducibly when
>>> building the webkit2gtk debian package, shortly after having shown the warning
>>> below.
>>>
>>> This patch keeps the lock of the dentry up until when the dentry is given back
>>> to the cache and after having freed the "external dentry name".
>>>
>>> I'm not sure if this patch is really correct, but it seems to have fixed the
>>> problem, although more testing is needed.
>>
>> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
>> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
>> be grabbed) into a UAF.
>>
>> Do you have a better localized reproducer?
> 
> BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
> that changes in there might accidentally fix that, and if they did it would narrow
> the things down a lot.

Ok, will try.
Please note that building the kernel, installing it and running the dpkg build
takes hours or even days, so it may take quite some time until I get back to you....
  
> Some invariants that ought to hold:
> 	1) dentry_free() should never be called without DCACHE_DENTRY_KILLED
> 	2) DCACHE_DENTRY_KILLED should never be set on positive dentries
> 	3) DCACHE_DENTRY_KILLED | DCACHE_PAR_LOOKUP is only possible for
> dentries that had never been inserted into ->d_in_lookup_hash
> 	4) dentry with DCACHE_DENTRY_KILLED should never become positive
> 
> Could you turn that
> 	WARN_ON(!hlist_unhashed(&dentry->d_alias));
> in whatever you'd been testing into
> 	if (WARN_ON(!hlist_unhashed(&dentry->d_alias)))
> 		printk(KERN_ERR "->d_inode = %p, ->d_flags = %x",
> 			dentry->d_inode, dentry->d_flags);
> and see what it shows?  That's a separate from #work.dcache-busy-wait test -
> please, do that one on the tree where you'd seen the original bug.

Ok.

Thanks!
Helge
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Al Viro 2 months, 2 weeks ago
On Mon, Apr 06, 2026 at 10:43:57PM +0200, Helge Deller wrote:

> > BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
> > that changes in there might accidentally fix that, and if they did it would narrow
> > the things down a lot.
> 
> Ok, will try.
> Please note that building kernel/ installing / running dpkg build takes hours & days,
> so it may take quite some time until I come back here....

Which kernel(s) had that been reproduced on, BTW?

Incidentally, can that be reproduced on any of qemu-based setups?  And do you need
to build the kernel natively?  Debian kernels can be cross-built, after all...
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Helge Deller 2 months, 2 weeks ago
On 4/6/26 23:10, Al Viro wrote:
> On Mon, Apr 06, 2026 at 10:43:57PM +0200, Helge Deller wrote:
> 
>>> BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
>>> that changes in there might accidentally fix that, and if they did it would narrow
>>> the things down a lot.
>>
>> Ok, will try.
>> Please note that building kernel/ installing / running dpkg build takes hours & days,
>> so it may take quite some time until I come back here....
> 
> Which kernel(s) had that been reproduced on, BTW?

This happened with all Debian kernels from (at least, maybe earlier) 6.16 up to 6.19.
  
> Incidentally, can that be reproduced on any of qemu-based setups?  And do you need
> to build the kernel natively?  Debian kernels can be cross-built, after all...
I know, and I do in some cases.
Nevertheless, everything often takes longer than expected... :-(

Helge
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Helge Deller 2 months, 2 weeks ago
Hi Al,

On 4/6/26 22:07, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
>> The debian buildd servers for the parisc architecture crash reproducibly when
>> building the webkit2gtk debian package, shortly after having shown the warning
>> below.
>>
>> This patch keeps the lock of the dentry up until when the dentry is given back
>> to the cache and after having freed the "external dentry name".
>>
>> I'm not sure if this patch is really correct, but it seems to have fixed the
>> problem, although more testing is needed.
> 
> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> be grabbed) into a UAF.

Thanks for looking into the patch!
I assume UAF means use-after-free?
As I'm not an expert here, could you please point me to where
this use-after-free happens?
The kfree() is used on the external dentry name, and the lock is
unlocked before calling kmem_cache_free(), so I'd not expect that I
introduced a UAF here. But of course I could be wrong....
  
> Do you have a better localized reproducer?

Sadly not yet. I will try, but since the package is huge and the machines are
relatively slow it's not easy to track down.

Thanks!
Helge
Re: [RFC] [PATCH] Fix warning at fs/dcache.c:430 dentry_free
Posted by Al Viro 2 months, 2 weeks ago
On Mon, Apr 06, 2026 at 10:21:17PM +0200, Helge Deller wrote:
> Hi Al,
> 
> On 4/6/26 22:07, Al Viro wrote:
> > On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> > > The debian buildd servers for the parisc architecture crash reproducibly when
> > > building the webkit2gtk debian package, shortly after having shown the warning
> > > below.
> > > 
> > > This patch keeps the lock of the dentry up until when the dentry is given back
> > > to the cache and after having freed the "external dentry name".
> > > 
> > > I'm not sure if this patch is really correct, but it seems to have fixed the
> > > problem, although more testing is needed.
> > 
> > Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> > for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> > be grabbed) into a UAF.
> 
> Thanks for looking into the patch!
> I assume UAF means use-after-free?
> As I'm not an expert here, could you please point me to where
> this use-after-free happens?
> The kfree() is used on the external dentry name, and the lock is
> unlocked before calling kmem_cache_free(), so I'd not expect that I
> introduced a UAF here. But of course I could be wrong....

s/UAF/deadlock/, actually.

A:	rcu_read_lock();
A:	find a dentry (lockless)
B:	grab dentry->d_lock
B:	dentry_free(dentry);
B:		call_rcu(..., __d_free) (or __d_free_external - whatever)
A:	grab dentry->d_lock, so we could verify that it's still live

A spins until __d_free() unlocks the sucker, which is not going to be called
until A does rcu_read_unlock().
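
A toy single-threaded model of the interleaving above, for illustration only.
It is not kernel code: the state struct, the function name and the step bound
are assumptions of the sketch. It encodes just two rules: an RCU callback
cannot run while a reader that started before call_rcu() is still inside its
read-side section, and A stays inside its read-side section until it has
obtained ->d_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct model {
	bool a_in_rcu_read;    /* A is between rcu_read_lock() and rcu_read_unlock() */
	bool d_lock_held;      /* dentry->d_lock is currently held by someone */
	bool callback_pending; /* B has queued __d_free() via call_rcu() */
};

/*
 * unlock_in_callback == false models the existing code: B drops d_lock
 * before dentry_free().  unlock_in_callback == true models the patch:
 * B keeps d_lock held and the RCU callback is the one that drops it.
 *
 * Returns true if A obtains d_lock within max_steps attempts,
 * false if A is still spinning when the bound is reached.
 */
static bool reader_makes_progress(bool unlock_in_callback, int max_steps)
{
	struct model m = { false, false, false };

	m.a_in_rcu_read = true;         /* A: rcu_read_lock(), finds the dentry */
	m.d_lock_held = true;           /* B: grabs dentry->d_lock */
	if (!unlock_in_callback)
		m.d_lock_held = false;  /* B: spin_unlock() before freeing */
	m.callback_pending = true;      /* B: call_rcu(..., __d_free) */

	for (int step = 0; step < max_steps; step++) {
		/* The grace period ends only once A has left its read-side section. */
		if (m.callback_pending && !m.a_in_rcu_read) {
			m.callback_pending = false;
			if (unlock_in_callback)
				m.d_lock_held = false; /* patched __d_free() unlocks */
		}

		/* A: try to take d_lock to verify the dentry is still live. */
		if (!m.d_lock_held) {
			m.d_lock_held = true;   /* A: got the lock, sees a dead dentry */
			m.d_lock_held = false;  /* A: spin_unlock() */
			m.a_in_rcu_read = false; /* A: rcu_read_unlock() */
			return true;
		}
		/* Otherwise A keeps spinning, still inside its read-side section. */
	}
	return false;
}
```

Calling reader_makes_progress(false, 1000) models the existing code, where A
gets the lock on its first attempt. Calling reader_makes_progress(true, 1000)
models the patch: the callback never becomes runnable because A never leaves
its read-side section, so A never gets the lock, whatever the step bound.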