The Debian buildd servers for the parisc architecture crash reproducibly when
building the webkit2gtk Debian package, shortly after having shown the warning
below.

This patch keeps the lock of the dentry up until when the dentry is given back
to the cache and after having freed the "external dentry name".

I'm not sure if this patch is really correct, but it seems to have fixed the
problem, although more testing is needed.

kernel: WARNING: CPU: 2 PID: 65 at fs/dcache.c:430 dentry_free+0x15c/0x188
kernel: Modules linked in: binfmt_misc ipmi_si ipmi_devintf ipmi_msghandler sg configfs nfnetlink autofs4 ex>
kernel: CPU: 2 UID: 0 PID: 65 Comm: kswapd0 Tainted: G W 6.18.15+deb14-parisc64 #1 NONE De>
kernel: Hardware name: 9000/800/rp4440
kernel:
kernel: IAOQ[0]: dentry_free+0x15c/0x188
kernel: IAOQ[1]: dentry_free+0x160/0x188
kernel: RP(r2): __dentry_kill+0x2a4/0x338
kernel: Backtrace:
kernel: [<00000000408d9eb0>] __dentry_kill+0x2a4/0x338
kernel: [<00000000408dc270>] shrink_dentry_list+0xfc/0x1d0
kernel: [<00000000408dcae4>] prune_dcache_sb+0x88/0xc0
kernel: [<00000000408a6410>] super_cache_scan+0x2bc/0x440
kernel: [<0000000040742f38>] do_shrink_slab+0x254/0x610
kernel: [<00000000407449b4>] shrink_slab+0x4d8/0x860
kernel: [<000000004073948c>] shrink_one+0x108/0x468
kernel: [<000000004073f270>] shrink_node+0xfdc/0x17a0
kernel: [<000000004074016c>] balance_pgdat+0x738/0xfb0
kernel: [<0000000040740da4>] kswapd+0x3c0/0x788
kernel: [<00000000403b35cc>] kthread+0x230/0x430
kernel: [<000000004031d020>] ret_from_kernel_thread+0x20/0x28
Signed-off-by: Helge Deller <deller@gmx.de>
diff --git a/fs/dcache.c b/fs/dcache.c
index 7ba1801d8132..c1123787d3bd 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -343,6 +343,7 @@ static void __d_free(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
+	spin_unlock(&dentry->d_lock);
 	kmem_cache_free(dentry_cache, dentry);
 }
@@ -350,6 +351,7 @@ static void __d_free_external(struct rcu_head *head)
 {
 	struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);
 	kfree(external_name(dentry));
+	spin_unlock(&dentry->d_lock);
 	kmem_cache_free(dentry_cache, dentry);
 }
@@ -684,9 +686,10 @@ static struct dentry *__dentry_kill(struct dentry *dentry)
 	dentry_unlist(dentry);
 	if (dentry->d_flags & DCACHE_SHRINK_LIST)
 		can_free = false;
-	spin_unlock(&dentry->d_lock);
 	if (likely(can_free))
 		dentry_free(dentry);
+	else
+		spin_unlock(&dentry->d_lock);
 	if (parent && --parent->d_lockref.count) {
 		spin_unlock(&parent->d_lock);
 		return NULL;
@@ -1165,9 +1168,10 @@ void shrink_dentry_list(struct list_head *list)
 			rcu_read_unlock();
 			d_shrink_del(dentry);
 			can_free = dentry->d_flags & DCACHE_DENTRY_KILLED;
-			spin_unlock(&dentry->d_lock);
 			if (can_free)
 				dentry_free(dentry);
+			else
+				spin_unlock(&dentry->d_lock);
 			continue;
 		}
 		d_shrink_del(dentry);
On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> The Debian buildd servers for the parisc architecture crash reproducibly when
> building the webkit2gtk Debian package, shortly after having shown the warning
> below.
>
> This patch keeps the lock of the dentry up until when the dentry is given back
> to the cache and after having freed the "external dentry name".
>
> I'm not sure if this patch is really correct, but it seems to have fixed the
> problem, although more testing is needed.

Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
for freeing (like, say it, any RCU pathwalk trying to check if the end result can
be grabbed) into a UAF.

Do you have a better localized reproducer?
On Mon, Apr 06, 2026 at 09:07:33PM +0100, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> > The Debian buildd servers for the parisc architecture crash reproducibly when
> > building the webkit2gtk Debian package, shortly after having shown the warning
> > below.
> >
> > This patch keeps the lock of the dentry up until when the dentry is given back
> > to the cache and after having freed the "external dentry name".
> >
> > I'm not sure if this patch is really correct, but it seems to have fixed the
> > problem, although more testing is needed.
>
> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> be grabbed) into a UAF.
>
> Do you have a better localized reproducer?

BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
that changes in there might accidentally fix that, and if they did it would narrow
the things down a lot.

Some invariants that ought to hold:
1) dentry_free() should never be called without DCACHE_DENTRY_KILLED
2) DCACHE_DENTRY_KILLED should never be set on positive dentries
3) DCACHE_DENTRY_KILLED | DCACHE_PAR_LOOKUP is only possible for
dentries that had never been inserted into ->d_in_lookup_hash
4) dentry with DCACHE_DENTRY_KILLED should never become positive

Could you turn that
	WARN_ON(!hlist_unhashed(&dentry->d_alias));
in whatever you'd been testing into
	if (WARN_ON(!hlist_unhashed(&dentry->d_alias)))
		printk(KERN_ERR "->d_inode = %p, ->d_flags = %x",
			dentry->d_inode, dentry->d_flags);
and see what it shows?  That's a separate from #work.dcache-busy-wait test -
please, do that one on the tree where you'd seen the original bug.
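The four invariants listed above can be read as a predicate over a dentry's state. The following is a minimal userspace sketch of that reading, not kernel code: `struct dentry_snapshot`, its fields, the `SKETCH_*` bit values and `first_violated_invariant()` are all hypothetical names introduced for illustration, and the bit values are placeholders rather than the kernel's definitions. It assumes "positive" means a non-NULL `d_inode`. Invariant 4 is temporal (a killed dentry must never later become positive), so a single snapshot can only catch it in the form of invariant 2.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this sketch only, NOT the kernel's definitions. */
#define SKETCH_DENTRY_KILLED 0x1u
#define SKETCH_PAR_LOOKUP    0x2u

/* Hypothetical snapshot of the state the invariants talk about. */
struct dentry_snapshot {
	const void  *d_inode;            /* NULL for a negative dentry */
	unsigned int d_flags;
	bool         was_in_lookup_hash; /* ever inserted into ->d_in_lookup_hash */
	bool         being_freed;        /* snapshot taken on entry to dentry_free() */
};

/*
 * Returns 0 if the snapshot is consistent with invariants 1-3,
 * otherwise the number of the first invariant it violates.
 */
int first_violated_invariant(const struct dentry_snapshot *s)
{
	bool killed = (s->d_flags & SKETCH_DENTRY_KILLED) != 0;
	bool par_lookup = (s->d_flags & SKETCH_PAR_LOOKUP) != 0;

	/* 1) dentry_free() is never called without the KILLED flag. */
	if (s->being_freed && !killed)
		return 1;
	/* 2) KILLED is never set on a positive dentry (also catches 4 at this instant). */
	if (killed && s->d_inode != NULL)
		return 2;
	/* 3) KILLED together with PAR_LOOKUP implies it never entered the lookup hash. */
	if (killed && par_lookup && s->was_in_lookup_hash)
		return 3;
	return 0;
}
```

Under these assumptions, the `->d_inode` and `->d_flags` values printed by the requested debugging change are the inputs needed to tell which of the invariants, if any, was broken when the warning fired.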
Hi Al,

On 4/6/26 22:28, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:07:33PM +0100, Al Viro wrote:
>> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
>>> The Debian buildd servers for the parisc architecture crash reproducibly when
>>> building the webkit2gtk Debian package, shortly after having shown the warning
>>> below.
>>>
>>> This patch keeps the lock of the dentry up until when the dentry is given back
>>> to the cache and after having freed the "external dentry name".
>>>
>>> I'm not sure if this patch is really correct, but it seems to have fixed the
>>> problem, although more testing is needed.
>>
>> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
>> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
>> be grabbed) into a UAF.
>>
>> Do you have a better localized reproducer?
>
> BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
> that changes in there might accidentally fix that, and if they did it would narrow
> the things down a lot.

Ok, will try.
Please note that building kernel/ installing / running dpkg build takes hours & days,
so it may take quite some time until I come back here....

> Some invariants that ought to hold:
> 1) dentry_free() should never be called without DCACHE_DENTRY_KILLED
> 2) DCACHE_DENTRY_KILLED should never be set on positive dentries
> 3) DCACHE_DENTRY_KILLED | DCACHE_PAR_LOOKUP is only possible for
> dentries that had never been inserted into ->d_in_lookup_hash
> 4) dentry with DCACHE_DENTRY_KILLED should never become positive
>
> Could you turn that
> 	WARN_ON(!hlist_unhashed(&dentry->d_alias));
> in whatever you'd been testing into
> 	if (WARN_ON(!hlist_unhashed(&dentry->d_alias)))
> 		printk(KERN_ERR "->d_inode = %p, ->d_flags = %x",
> 			dentry->d_inode, dentry->d_flags);
> and see what it shows?  That's a separate from #work.dcache-busy-wait test -
> please, do that one on the tree where you'd seen the original bug.

Ok.

Thanks!
Helge
On Mon, Apr 06, 2026 at 10:43:57PM +0200, Helge Deller wrote:

> > BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
> > that changes in there might accidentally fix that, and if they did it would narrow
> > the things down a lot.
>
> Ok, will try.
> Please note that building kernel/ installing / running dpkg build takes hours & days,
> so it may take quite some time until I come back here....

Which kernel(s) had that been reproduced on, BTW?

Incidentally, can that be reproduced on any of qemu-based setups?  And do you need
to build the kernel natively?  Debian kernels can be cross-built, after all...
On 4/6/26 23:10, Al Viro wrote:
> On Mon, Apr 06, 2026 at 10:43:57PM +0200, Helge Deller wrote:
>
>>> BTW, could you reproduce it on viro/vfs.git #work.dcache-busy-wait?  It's possible
>>> that changes in there might accidentally fix that, and if they did it would narrow
>>> the things down a lot.
>>
>> Ok, will try.
>> Please note that building kernel/ installing / running dpkg build takes hours & days,
>> so it may take quite some time until I come back here....
>
> Which kernel(s) had that been reproduced on, BTW?

This happened with all Debian kernels from (at least, maybe earlier) 6.16 up to 6.19.

> Incidentally, can that be reproduced on any of qemu-based setups?  And do you need
> to build the kernel natively?  Debian kernels can be cross-built, after all...

I know, and I do in some cases.
Nevertheless, everything often takes longer than expected... :-(

Helge
Hi Al,

On 4/6/26 22:07, Al Viro wrote:
> On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
>> The Debian buildd servers for the parisc architecture crash reproducibly when
>> building the webkit2gtk Debian package, shortly after having shown the warning
>> below.
>>
>> This patch keeps the lock of the dentry up until when the dentry is given back
>> to the cache and after having freed the "external dentry name".
>>
>> I'm not sure if this patch is really correct, but it seems to have fixed the
>> problem, although more testing is needed.
>
> Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> be grabbed) into a UAF.

Thanks for looking into the patch!
I assume UAF means Use-after-free?
As I'm not an expert here, could you please point me to where
this use-after-free happens?
The kfree() is used on the external dentry name, and the lock is
unlocked before calling kmem_cache_free(), so I'd not expect that I
introduced a UAF here. But of course I could be wrong....

> Do you have a better localized reproducer?

Sadly not yet. I will try, but since the package is huge and the
machines are relatively slow it's not easy to track down.

Thanks!
Helge
On Mon, Apr 06, 2026 at 10:21:17PM +0200, Helge Deller wrote:
> Hi Al,
>
> On 4/6/26 22:07, Al Viro wrote:
> > On Mon, Apr 06, 2026 at 09:52:16PM +0200, Helge Deller wrote:
> > > The Debian buildd servers for the parisc architecture crash reproducibly when
> > > building the webkit2gtk Debian package, shortly after having shown the warning
> > > below.
> > >
> > > This patch keeps the lock of the dentry up until when the dentry is given back
> > > to the cache and after having freed the "external dentry name".
> > >
> > > I'm not sure if this patch is really correct, but it seems to have fixed the
> > > problem, although more testing is needed.
> >
> > Hard NAK.  You are turning every place that grabs ->d_lock on a dentry scheduled
> > for freeing (like, say it, any RCU pathwalk trying to check if the end result can
> > be grabbed) into a UAF.
>
> Thanks for looking into the patch!
> I assume UAF means Use-after-free?
> As I'm not an expert here, could you please point me to where
> this use-after-free happens?
> The kfree() is used on the external dentry name, and the lock is
> unlocked before calling kmem_cache_free(), so I'd not expect that I
> introduced a UAF here. But of course I could be wrong....

s/UAF/deadlock/, actually.

A: rcu_read_lock();
A: find a dentry (lockless)
B: grab dentry->d_lock
B: dentry_free(dentry);
B:	call_rcu(..., __d_free) (or __d_free_external - whatever)
A: grab dentry->d_lock, so we could verify that it's still live

A spins until __d_free() unlocks the sucker, which is not going to be called until
A does rcu_read_unlock().
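The A/B interleaving above can be replayed in a small single-threaded toy model. This is only a sketch under simplifying assumptions, not kernel code: `struct model`, `callback_can_run()` and `interleaving_deadlocks()` are hypothetical names, the spinlock is a boolean, and RCU is reduced to the rule that a queued callback runs only after every reader that was already inside `rcu_read_lock()` has left. The `unlock_in_callback` parameter selects between the proposed patch (unlock deferred to `__d_free()`) and the existing behaviour (unlock before `dentry_free()`).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the state shared by thread A (RCU reader) and thread B (killer). */
struct model {
	int  rcu_readers;      /* threads currently inside rcu_read_lock() */
	bool d_lock_held;      /* stands in for dentry->d_lock */
	bool callback_pending; /* stands in for __d_free() queued by call_rcu() */
};

/* A queued callback may run only once the pre-existing readers are gone. */
static bool callback_can_run(const struct model *m)
{
	return m->callback_pending && m->rcu_readers == 0;
}

/*
 * Replays the interleaving from the mail. Returns true if it ends in a
 * state where neither A nor the RCU callback can make progress.
 */
bool interleaving_deadlocks(bool unlock_in_callback)
{
	struct model m = { 0, false, false };
	bool a_wants_lock;

	m.rcu_readers++;           /* A: rcu_read_lock(), then lockless lookup */
	assert(!m.d_lock_held);
	m.d_lock_held = true;      /* B: grab dentry->d_lock */
	if (!unlock_in_callback)
		m.d_lock_held = false; /* B: unlock before dentry_free() */
	m.callback_pending = true; /* B: dentry_free() -> call_rcu(..., __d_free) */
	a_wants_lock = true;       /* A: tries to grab dentry->d_lock */

	/* Keep taking whichever step is enabled until nothing can move. */
	for (;;) {
		if (a_wants_lock && !m.d_lock_held) {
			/* A gets the lock, sees a dead dentry, drops it and backs off. */
			a_wants_lock = false;
			m.rcu_readers--;   /* A: rcu_read_unlock() */
		} else if (callback_can_run(&m)) {
			m.callback_pending = false;
			if (unlock_in_callback)
				m.d_lock_held = false; /* patched __d_free() unlocks here */
		} else {
			break;
		}
	}
	return a_wants_lock || m.callback_pending;
}
```

In this model `interleaving_deadlocks(true)` gets stuck with A waiting for the lock and the callback waiting for A, while `interleaving_deadlocks(false)` lets A back off and the callback complete, which is the cycle the reply describes.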