[PATCH] fuse: fix inode initialization race

Horst Birthelmer posted 1 patch 2 weeks, 5 days ago
There is a newer version of this series
fs/fuse/fuse_i.h | 3 +++
fs/fuse/inode.c  | 8 ++++++++
2 files changed, 11 insertions(+)
[PATCH] fuse: fix inode initialization race
Posted by Horst Birthelmer 2 weeks, 5 days ago
From: Horst Birthelmer <hbirthelmer@ddn.com>

Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
invalidation can arrive while an inode is being initialized, causing
the invalidation to be lost.

Add a waitqueue to make fuse_reverse_inval_inode() wait when it
encounters an inode with attr_version == 0 (still initializing).
When fuse_change_attributes_common() completes initialization, it
wakes waiting threads.

This ensures invalidations are properly serialized with inode
initialization, maintaining cache coherency.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
---
 fs/fuse/fuse_i.h | 3 +++
 fs/fuse/inode.c  | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 7f16049387d15e869db4be23a93605098588eda9..1be611472eee276371b3bde1a55257c1116cfedd 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -945,6 +945,9 @@ struct fuse_conn {
 	/** Version counter for attribute changes */
 	atomic64_t attr_version;
 
+	/** Waitqueue for attr_version initialization */
+	wait_queue_head_t attr_version_waitq;
+
 	/** Version counter for evict inode */
 	atomic64_t evict_ctr;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index e57b8af06be93ecc29c58864a9c9e99c68e3283b..c6e7e50d80c0edaea57d9342869eaf811786e342 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -246,6 +246,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
 		set_mask_bits(&fi->inval_mask, STATX_BASIC_STATS, 0);
 
 	fi->attr_version = atomic64_inc_return(&fc->attr_version);
+	wake_up_all(&fc->attr_version_waitq);
 	fi->i_time = attr_valid;
 
 	inode->i_ino     = fuse_squash_ino(attr->ino);
@@ -567,6 +568,12 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
 
 	fi = get_fuse_inode(inode);
 	spin_lock(&fi->lock);
+	while (fi->attr_version == 0) {
+		spin_unlock(&fi->lock);
+		wait_event(fc->attr_version_waitq, READ_ONCE(fi->attr_version) != 0);
+		spin_lock(&fi->lock);
+	}
+
 	fi->attr_version = atomic64_inc_return(&fc->attr_version);
 	spin_unlock(&fi->lock);
 
@@ -979,6 +986,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
 	atomic_set(&fc->epoch, 1);
 	INIT_WORK(&fc->epoch_work, fuse_epoch_work);
 	init_waitqueue_head(&fc->blocked_waitq);
+	init_waitqueue_head(&fc->attr_version_waitq);
 	fuse_iqueue_init(&fc->iq, fiq_ops, fiq_priv);
 	INIT_LIST_HEAD(&fc->bg_queue);
 	INIT_LIST_HEAD(&fc->entry);

---
base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
change-id: 20260318-fix-inode-init-race-a47a7ba4af1e

Best regards,
-- 
Horst Birthelmer <hbirthelmer@ddn.com>
Re: [PATCH] fuse: fix inode initialization race
Posted by Miklos Szeredi 1 week, 4 days ago
On Wed, 18 Mar 2026 at 14:45, Horst Birthelmer <horst@birthelmer.com> wrote:
>
> From: Horst Birthelmer <hbirthelmer@ddn.com>
>
> Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
> invalidation can arrive while an inode is being initialized, causing
> the invalidation to be lost.
>
> Add a waitqueue to make fuse_reverse_inval_inode() wait when it
> encounters an inode with attr_version == 0 (still initializing).
> When fuse_change_attributes_common() completes initialization, it
> wakes waiting threads.

This should be relatively rare, right?  In that case a single global
waitq and wake_up_all() would be better, imo.

Thanks,
Miklos
Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Horst Birthelmer 1 week, 4 days ago
On Thu, Mar 26, 2026 at 03:51:18PM +0100, Miklos Szeredi wrote:
> On Wed, 18 Mar 2026 at 14:45, Horst Birthelmer <horst@birthelmer.com> wrote:
> >
> > From: Horst Birthelmer <hbirthelmer@ddn.com>
> >
> > Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
> > invalidation can arrive while an inode is being initialized, causing
> > the invalidation to be lost.
> >
> > Add a waitqueue to make fuse_reverse_inval_inode() wait when it
> > encounters an inode with attr_version == 0 (still initializing).
> > When fuse_change_attributes_common() completes initialization, it
> > wakes waiting threads.
> 
> This should be relatively rare, right?  In that case a single global
> waitq and wake_up_all() would be better, imo.

Well it depends on the use case. We send relatively many notifications
since they are bound to the DLM system and thus to changes done by a lot
of clients and so it happens that you get an invalidation while still
creating the inode.

What is wrong with one per connection?

> 
> Thanks,
> Miklos
Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Miklos Szeredi 1 week, 4 days ago
On Thu, 26 Mar 2026 at 15:57, Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Thu, Mar 26, 2026 at 03:51:18PM +0100, Miklos Szeredi wrote:
> > On Wed, 18 Mar 2026 at 14:45, Horst Birthelmer <horst@birthelmer.com> wrote:
> > >
> > > From: Horst Birthelmer <hbirthelmer@ddn.com>
> > >
> > > Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
> > > invalidation can arrive while an inode is being initialized, causing
> > > the invalidation to be lost.
> > >
> > > Add a waitqueue to make fuse_reverse_inval_inode() wait when it
> > > encounters an inode with attr_version == 0 (still initializing).
> > > When fuse_change_attributes_common() completes initialization, it
> > > wakes waiting threads.
> >
> > This should be relatively rare, right?  In that case a single global
> > waitq and wake_up_all() would be better, imo.
>
> Well it depends on the use case. We send relatively many notifications
> since they are bound to the DLM system and thus to changes done by a lot
> of clients and so it happens that you get an invalidation while still
> creating the inode.
>
> What is wrong with one per connection?

It seemed to be something that would be very rarely used, hence having
a waitq_head per fc is not space efficient.

If two such events are likely to collide multiple times per second,
then I have nothing against a per-fc waitq, otherwise a global one
will be just as good.

Thanks,
Miklos
Re: [PATCH] fuse: fix inode initialization race
Posted by Bernd Schubert 1 week, 5 days ago

On 3/18/26 14:43, Horst Birthelmer wrote:
> From: Horst Birthelmer <hbirthelmer@ddn.com>
> 
> Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
> invalidation can arrive while an inode is being initialized, causing
> the invalidation to be lost.
> 
> Add a waitqueue to make fuse_reverse_inval_inode() wait when it
> encounters an inode with attr_version == 0 (still initializing).
> When fuse_change_attributes_common() completes initialization, it
> wakes waiting threads.
> 
> This ensures invalidations are properly serialized with inode
> initialization, maintaining cache coherency.
> 
> Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
> ---
>  fs/fuse/fuse_i.h | 3 +++
>  fs/fuse/inode.c  | 8 ++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 7f16049387d15e869db4be23a93605098588eda9..1be611472eee276371b3bde1a55257c1116cfedd 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -945,6 +945,9 @@ struct fuse_conn {
>  	/** Version counter for attribute changes */
>  	atomic64_t attr_version;
>  
> +	/** Waitqueue for attr_version initialization */
> +	wait_queue_head_t attr_version_waitq;
> +
>  	/** Version counter for evict inode */
>  	atomic64_t evict_ctr;
>  
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index e57b8af06be93ecc29c58864a9c9e99c68e3283b..c6e7e50d80c0edaea57d9342869eaf811786e342 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -246,6 +246,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
>  		set_mask_bits(&fi->inval_mask, STATX_BASIC_STATS, 0);
>  
>  	fi->attr_version = atomic64_inc_return(&fc->attr_version);
> +	wake_up_all(&fc->attr_version_waitq);
>  	fi->i_time = attr_valid;
>  
>  	inode->i_ino     = fuse_squash_ino(attr->ino);
> @@ -567,6 +568,12 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>  
>  	fi = get_fuse_inode(inode);
>  	spin_lock(&fi->lock);
> +	while (fi->attr_version == 0) {
> +		spin_unlock(&fi->lock);
> +		wait_event(fc->attr_version_waitq, READ_ONCE(fi->attr_version) != 0);
> +		spin_lock(&fi->lock);
> +	}
> +
>  	fi->attr_version = atomic64_inc_return(&fc->attr_version);
>  	spin_unlock(&fi->lock);
>  
> @@ -979,6 +986,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
>  	atomic_set(&fc->epoch, 1);
>  	INIT_WORK(&fc->epoch_work, fuse_epoch_work);
>  	init_waitqueue_head(&fc->blocked_waitq);
> +	init_waitqueue_head(&fc->attr_version_waitq);
>  	fuse_iqueue_init(&fc->iq, fiq_ops, fiq_priv);
>  	INIT_LIST_HEAD(&fc->bg_queue);
>  	INIT_LIST_HEAD(&fc->entry);
> 
> ---
> base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
> change-id: 20260318-fix-inode-init-race-a47a7ba4af1e
> 
> Best regards,

Had reviewed that DDN internally already. LGTM

Reviewed-by: Bernd Schubert <bernd@bsbernd.com>
Re: [PATCH] fuse: fix inode initialization race
Posted by Christian Brauner 1 week, 4 days ago
On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> 
> 
> On 3/18/26 14:43, Horst Birthelmer wrote:
> > From: Horst Birthelmer <hbirthelmer@ddn.com>
> > 
> > Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
> > invalidation can arrive while an inode is being initialized, causing
> > the invalidation to be lost.
> > 
> > Add a waitqueue to make fuse_reverse_inval_inode() wait when it
> > encounters an inode with attr_version == 0 (still initializing).
> > When fuse_change_attributes_common() completes initialization, it
> > wakes waiting threads.
> > 
> > This ensures invalidations are properly serialized with inode
> > initialization, maintaining cache coherency.
> > 
> > Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
> > ---
> >  fs/fuse/fuse_i.h | 3 +++
> >  fs/fuse/inode.c  | 8 ++++++++
> >  2 files changed, 11 insertions(+)
> > 
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index 7f16049387d15e869db4be23a93605098588eda9..1be611472eee276371b3bde1a55257c1116cfedd 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -945,6 +945,9 @@ struct fuse_conn {
> >  	/** Version counter for attribute changes */
> >  	atomic64_t attr_version;
> >  
> > +	/** Waitqueue for attr_version initialization */
> > +	wait_queue_head_t attr_version_waitq;
> > +
> >  	/** Version counter for evict inode */
> >  	atomic64_t evict_ctr;
> >  
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index e57b8af06be93ecc29c58864a9c9e99c68e3283b..c6e7e50d80c0edaea57d9342869eaf811786e342 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -246,6 +246,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
> >  		set_mask_bits(&fi->inval_mask, STATX_BASIC_STATS, 0);
> >  
> >  	fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > +	wake_up_all(&fc->attr_version_waitq);
> >  	fi->i_time = attr_valid;
> >  
> >  	inode->i_ino     = fuse_squash_ino(attr->ino);
> > @@ -567,6 +568,12 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
> >  
> >  	fi = get_fuse_inode(inode);
> >  	spin_lock(&fi->lock);
> > +	while (fi->attr_version == 0) {
> > +		spin_unlock(&fi->lock);
> > +		wait_event(fc->attr_version_waitq, READ_ONCE(fi->attr_version) != 0);
> > +		spin_lock(&fi->lock);
> > +	}
> > +
> >  	fi->attr_version = atomic64_inc_return(&fc->attr_version);
> >  	spin_unlock(&fi->lock);
> >  
> > @@ -979,6 +986,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
> >  	atomic_set(&fc->epoch, 1);
> >  	INIT_WORK(&fc->epoch_work, fuse_epoch_work);
> >  	init_waitqueue_head(&fc->blocked_waitq);
> > +	init_waitqueue_head(&fc->attr_version_waitq);
> >  	fuse_iqueue_init(&fc->iq, fiq_ops, fiq_priv);
> >  	INIT_LIST_HEAD(&fc->bg_queue);
> >  	INIT_LIST_HEAD(&fc->entry);
> > 
> > ---
> > base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
> > change-id: 20260318-fix-inode-init-race-a47a7ba4af1e
> > 
> > Best regards,
> 
> Had reviewed that DDN internally already. LGTM
> 
> Reviewed-by: Bernd Schubert <bernd@bsbernd.com>

Should I grab it?
Re: [PATCH] fuse: fix inode initialization race
Posted by Bernd Schubert 1 week, 4 days ago

On 3/26/26 15:26, Christian Brauner wrote:
> On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
>>
>>
>> On 3/18/26 14:43, Horst Birthelmer wrote:
>>> From: Horst Birthelmer <hbirthelmer@ddn.com>
>>>
>>> Fix a race between fuse_iget() and fuse_reverse_inval_inode() where
>>> invalidation can arrive while an inode is being initialized, causing
>>> the invalidation to be lost.
>>>
>>> Add a waitqueue to make fuse_reverse_inval_inode() wait when it
>>> encounters an inode with attr_version == 0 (still initializing).
>>> When fuse_change_attributes_common() completes initialization, it
>>> wakes waiting threads.
>>>
>>> This ensures invalidations are properly serialized with inode
>>> initialization, maintaining cache coherency.
>>>
>>> Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
>>> ---
>>>  fs/fuse/fuse_i.h | 3 +++
>>>  fs/fuse/inode.c  | 8 ++++++++
>>>  2 files changed, 11 insertions(+)
>>>
>>> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
>>> index 7f16049387d15e869db4be23a93605098588eda9..1be611472eee276371b3bde1a55257c1116cfedd 100644
>>> --- a/fs/fuse/fuse_i.h
>>> +++ b/fs/fuse/fuse_i.h
>>> @@ -945,6 +945,9 @@ struct fuse_conn {
>>>  	/** Version counter for attribute changes */
>>>  	atomic64_t attr_version;
>>>  
>>> +	/** Waitqueue for attr_version initialization */
>>> +	wait_queue_head_t attr_version_waitq;
>>> +
>>>  	/** Version counter for evict inode */
>>>  	atomic64_t evict_ctr;
>>>  
>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>>> index e57b8af06be93ecc29c58864a9c9e99c68e3283b..c6e7e50d80c0edaea57d9342869eaf811786e342 100644
>>> --- a/fs/fuse/inode.c
>>> +++ b/fs/fuse/inode.c
>>> @@ -246,6 +246,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
>>>  		set_mask_bits(&fi->inval_mask, STATX_BASIC_STATS, 0);
>>>  
>>>  	fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>> +	wake_up_all(&fc->attr_version_waitq);
>>>  	fi->i_time = attr_valid;


While I'm looking at this again, wouldn't it make sense to make this
conditional? Because we wake this queue on every attr change for every
inode. And the conditional in fuse_iget() based on I_NEW?



Thanks,
Bernd
Re: [PATCH] fuse: fix inode initialization race
Posted by Miklos Szeredi 1 week, 4 days ago
On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
>
>
>
> On 3/26/26 15:26, Christian Brauner wrote:
> > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> >>
> >>
> >> On 3/18/26 14:43, Horst Birthelmer wrote:
> >>> From: Horst Birthelmer <hbirthelmer@ddn.com>

> >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> >>> +   wake_up_all(&fc->attr_version_waitq);
> >>>     fi->i_time = attr_valid;
>
>
> While I'm looking at this again, wouldn't it make sense to make this
> conditional? Because we wake this queue on every attr change for every
> inode. And the conditional in fuse_iget() based on I_NEW?

Right, should only wake if fi->attr_version old value was zero.

BTW I have a hunch that there are better solutions, but it's simple
enough as a stopgap measure.

Thanks,
Miklos
Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Horst Birthelmer 1 week, 4 days ago
On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> >
> >
> >
> > On 3/26/26 15:26, Christian Brauner wrote:
> > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > >>
> > >>
> > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> 
> > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > >>> +   wake_up_all(&fc->attr_version_waitq);
> > >>>     fi->i_time = attr_valid;
> >
> >
> > While I'm looking at this again, wouldn't it make sense to make this
> > conditional? Because we wake this queue on every attr change for every
> > inode. And the conditional in fuse_iget() based on I_NEW?
> 
> Right, should only wake if fi->attr_version old value was zero.
> 
> BTW I have a hunch that there are better solutions, but it's simple
> enough as a stopgap measure.

OK, I'll send a new version.

Just out of curiosity, what would be a better solution?

> 
> Thanks,
> Miklos

Thanks,
Horst
Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Joanne Koong 1 week, 4 days ago
On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> > On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> > >
> > >
> > >
> > > On 3/26/26 15:26, Christian Brauner wrote:
> > > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > > >>
> > > >>
> > > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> >
> > > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > > >>> +   wake_up_all(&fc->attr_version_waitq);
> > > >>>     fi->i_time = attr_valid;
> > >
> > >
> > > While I'm looking at this again, wouldn't it make sense to make this
> > > conditional? Because we wake this queue on every attr change for every
> > > inode. And the conditional in fuse_iget() based on I_NEW?
> >
> > Right, should only wake if fi->attr_version old value was zero.
> >
> > BTW I have a hunch that there are better solutions, but it's simple
> > enough as a stopgap measure.
>
> OK, I'll send a new version.
>
> Just out of curiosity, what would be a better solution?

I'm probably missing something here but why can't we just call the

        fi = get_fuse_inode(inode);
        spin_lock(&fi->lock);
        fi->nlookup++;
        spin_unlock(&fi->lock);
        fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
                                 evict_ctr);

logic before releasing the inode lock (the unlock_new_inode() call) in
fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
after the attributes initialization has finished.

As I understand it, fuse_change_attributes_i() would be pretty
straightforward / fast for I_NEW inodes, as it doesn't send any
synchronous requests and for the I_NEW case the
invalidate_inode_pages2() and truncate_pagecache() calls would get
skipped. (truncate_pagecache() getting skipped because inode->i_size
is already attr->size from fuse_init_inode(), so "oldsize !=
attr->size" is never true; and invalidate_inode_pages2() getting
skipped because "oldsize != attr->size" is never true and "if
(!timespec64_equal(&old_mtime, &new_mtime))" is never true because
fuse_init_inode() initialized the inode's mtime to attr->mtime).

Thanks,
Joanne

>
> >
> > Thanks,
> > Miklos
>
> Thanks,
> Horst
>
Re: Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Horst Birthelmer 1 week, 3 days ago
On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
> On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >
> > On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> > > On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> > > >
> > > >
> > > >
> > > > On 3/26/26 15:26, Christian Brauner wrote:
> > > > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > > > >>
> > > > >>
> > > > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > > > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> > >
> > > > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > > > >>> +   wake_up_all(&fc->attr_version_waitq);
> > > > >>>     fi->i_time = attr_valid;
> > > >
> > > >
> > > > While I'm looking at this again, wouldn't it make sense to make this
> > > > conditional? Because we wake this queue on every attr change for every
> > > > inode. And the conditional in fuse_iget() based on I_NEW?
> > >
> > > Right, should only wake if fi->attr_version old value was zero.
> > >
> > > BTW I have a hunch that there are better solutions, but it's simple
> > > enough as a stopgap measure.
> >
> > OK, I'll send a new version.
> >
> > Just out of curiosity, what would be a better solution?
> 
> I'm probably missing something here but why can't we just call the
> 
>         fi = get_fuse_inode(inode);
>         spin_lock(&fi->lock);
>         fi->nlookup++;
>         spin_unlock(&fi->lock);
>         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
>                                  evict_ctr);
> 
> logic before releasing the inode lock (the unlock_new_inode() call) in
> fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
> fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
> after the attributes initialization has finished.
> 
> As I understand it, fuse_change_attributes_i() would be pretty
> straightforward / fast for I_NEW inodes, as it doesn't send any
> synchronous requests and for the I_NEW case the
> invalidate_inode_pages2() and truncate_pagecache() calls would get
> skipped. (truncate_pagecache() getting skipped because inode->i_size
> is already attr->size from fuse_init_inode(), so "oldsize !=
> attr->size" is never true; and invalidate_inode_pages2() getting
> skipped because "oldsize != attr->size" is never true and "if
> (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
> fuse_init_inode() initialized the inode's mtime to attr->mtime).

You understand the pretty well, I think.
The problem I have there is that fuse_change_attributes_i() takes
its own lock.
That would be a pretty big operation to split that function.

Is that required for this small (as Miklos put it, rare) case?

If you guys want me to do that, I will.

Thanks,
Horst
Re: Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Joanne Koong 1 week, 3 days ago
On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
> > On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > >
> > > On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> > > > On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 3/26/26 15:26, Christian Brauner wrote:
> > > > > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > > > > >>
> > > > > >>
> > > > > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > > > > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> > > >
> > > > > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > > > > >>> +   wake_up_all(&fc->attr_version_waitq);
> > > > > >>>     fi->i_time = attr_valid;
> > > > >
> > > > >
> > > > > While I'm looking at this again, wouldn't it make sense to make this
> > > > > conditional? Because we wake this queue on every attr change for every
> > > > > inode. And the conditional in fuse_iget() based on I_NEW?
> > > >
> > > > Right, should only wake if fi->attr_version old value was zero.
> > > >
> > > > BTW I have a hunch that there are better solutions, but it's simple
> > > > enough as a stopgap measure.
> > >
> > > OK, I'll send a new version.
> > >
> > > Just out of curiosity, what would be a better solution?
> >
> > I'm probably missing something here but why can't we just call the
> >
> >         fi = get_fuse_inode(inode);
> >         spin_lock(&fi->lock);
> >         fi->nlookup++;
> >         spin_unlock(&fi->lock);
> >         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
> >                                  evict_ctr);
> >
> > logic before releasing the inode lock (the unlock_new_inode() call) in
> > fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
> > fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
> > after the attributes initialization has finished.
> >
> > As I understand it, fuse_change_attributes_i() would be pretty
> > straightforward / fast for I_NEW inodes, as it doesn't send any
> > synchronous requests and for the I_NEW case the
> > invalidate_inode_pages2() and truncate_pagecache() calls would get
> > skipped. (truncate_pagecache() getting skipped because inode->i_size
> > is already attr->size from fuse_init_inode(), so "oldsize !=
> > attr->size" is never true; and invalidate_inode_pages2() getting
> > skipped because "oldsize != attr->size" is never true and "if
> > (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
> > fuse_init_inode() initialized the inode's mtime to attr->mtime).
>
> You understand the pretty well, I think.
> The problem I have there is that fuse_change_attributes_i() takes
> its own lock.
> That would be a pretty big operation to split that function.

I believe fuse_change_attribtues_i() takes the fi lock, not the inode
lock, so this should be fine.

Thanks,
Joanne
>
> Is that required for this small (as Miklos put it, rare) case?
>
> If you guys want me to do that, I will.
>
> Thanks,
> Horst
Re: [PATCH] fuse: fix inode initialization race
Posted by Bernd Schubert 1 week, 3 days ago

On 3/26/26 19:00, Joanne Koong wrote:
> On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>
>> On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
>>> On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>>>
>>>> On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
>>>>> On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/26/26 15:26, Christian Brauner wrote:
>>>>>>> On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/18/26 14:43, Horst Birthelmer wrote:
>>>>>>>>> From: Horst Birthelmer <hbirthelmer@ddn.com>
>>>>>
>>>>>>>>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>>>>>>>> +   wake_up_all(&fc->attr_version_waitq);
>>>>>>>>>     fi->i_time = attr_valid;
>>>>>>
>>>>>>
>>>>>> While I'm looking at this again, wouldn't it make sense to make this
>>>>>> conditional? Because we wake this queue on every attr change for every
>>>>>> inode. And the conditional in fuse_iget() based on I_NEW?
>>>>>
>>>>> Right, should only wake if fi->attr_version old value was zero.
>>>>>
>>>>> BTW I have a hunch that there are better solutions, but it's simple
>>>>> enough as a stopgap measure.
>>>>
>>>> OK, I'll send a new version.
>>>>
>>>> Just out of curiosity, what would be a better solution?
>>>
>>> I'm probably missing something here but why can't we just call the
>>>
>>>         fi = get_fuse_inode(inode);
>>>         spin_lock(&fi->lock);
>>>         fi->nlookup++;
>>>         spin_unlock(&fi->lock);
>>>         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
>>>                                  evict_ctr);
>>>
>>> logic before releasing the inode lock (the unlock_new_inode() call) in
>>> fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
>>> fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
>>> after the attributes initialization has finished.
>>>
>>> As I understand it, fuse_change_attributes_i() would be pretty
>>> straightforward / fast for I_NEW inodes, as it doesn't send any
>>> synchronous requests and for the I_NEW case the
>>> invalidate_inode_pages2() and truncate_pagecache() calls would get
>>> skipped. (truncate_pagecache() getting skipped because inode->i_size
>>> is already attr->size from fuse_init_inode(), so "oldsize !=
>>> attr->size" is never true; and invalidate_inode_pages2() getting
>>> skipped because "oldsize != attr->size" is never true and "if
>>> (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
>>> fuse_init_inode() initialized the inode's mtime to attr->mtime).
>>
>> You understand the pretty well, I think.
>> The problem I have there is that fuse_change_attributes_i() takes
>> its own lock.
>> That would be a pretty big operation to split that function.
> 
> I believe fuse_change_attribtues_i() takes the fi lock, not the inode
> lock, so this should be fine.
> 


Ah, you want to update the attributes before unlock_new_inode()? The
risk I see is that I don't imediately see all code paths of
truncate_pagecache and invalidate_inode_pages2. Even if there is no
issue right, who would easily notice the fuse behavior in the future.
I kind of agree with that method if the condition would be

		if (oldsize > attr->size) {
			truncate_pagecache(inode, attr->size);

and I don't understand why it is '!='. I.e. for a new inode oldsize
would be 0 - the condition would never trigger and my concern would
never be true.

Thanks,
Bernd

Re: [PATCH] fuse: fix inode initialization race
Posted by Joanne Koong 1 week, 3 days ago
On Thu, Mar 26, 2026 at 11:16 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>
>
> On 3/26/26 19:00, Joanne Koong wrote:
> > On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >>
> >> On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
> >>> On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >>>>
> >>>> On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> >>>>> On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 3/26/26 15:26, Christian Brauner wrote:
> >>>>>>> On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 3/18/26 14:43, Horst Birthelmer wrote:
> >>>>>>>>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> >>>>>
> >>>>>>>>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> >>>>>>>>> +   wake_up_all(&fc->attr_version_waitq);
> >>>>>>>>>     fi->i_time = attr_valid;
> >>>>>>
> >>>>>>
> >>>>>> While I'm looking at this again, wouldn't it make sense to make this
> >>>>>> conditional? Because we wake this queue on every attr change for every
> >>>>>> inode. And the conditional in fuse_iget() based on I_NEW?
> >>>>>
> >>>>> Right, should only wake if fi->attr_version old value was zero.
> >>>>>
> >>>>> BTW I have a hunch that there are better solutions, but it's simple
> >>>>> enough as a stopgap measure.
> >>>>
> >>>> OK, I'll send a new version.
> >>>>
> >>>> Just out of curiosity, what would be a better solution?
> >>>
> >>> I'm probably missing something here but why can't we just call the
> >>>
> >>>         fi = get_fuse_inode(inode);
> >>>         spin_lock(&fi->lock);
> >>>         fi->nlookup++;
> >>>         spin_unlock(&fi->lock);
> >>>         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
> >>>                                  evict_ctr);
> >>>
> >>> logic before releasing the inode lock (the unlock_new_inode() call) in
> >>> fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
> >>> fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
> >>> after the attributes initialization has finished.
> >>>
> >>> As I understand it, fuse_change_attributes_i() would be pretty
> >>> straightforward / fast for I_NEW inodes, as it doesn't send any
> >>> synchronous requests and for the I_NEW case the
> >>> invalidate_inode_pages2() and truncate_pagecache() calls would get
> >>> skipped. (truncate_pagecache() getting skipped because inode->i_size
> >>> is already attr->size from fuse_init_inode(), so "oldsize !=
> >>> attr->size" is never true; and invalidate_inode_pages2() getting
> >>> skipped because "oldsize != attr->size" is never true and "if
> >>> (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
> >>> fuse_init_inode() initialized the inode's mtime to attr->mtime).
> >>
> >> You understand the pretty well, I think.
> >> The problem I have there is that fuse_change_attributes_i() takes
> >> its own lock.
> >> That would be a pretty big operation to split that function.
> >
> > I believe fuse_change_attribtues_i() takes the fi lock, not the inode
> > lock, so this should be fine.
> >
>
>
> Ah, you want to update the attributes before unlock_new_inode()? The
> risk I see is that I don't imediately see all code paths of
> truncate_pagecache and invalidate_inode_pages2. Even if there is no

We never call into truncate_pagecache() or invalidate_inode_pages2()
from fuse_change_attributes_i():
 truncate_pagecache() is gated by oldsize != attr->size:
    for I_NEW inodes, fuse_init_inode() sets inode->i_size to
attr->size, which means "oldsize != attr->size" is alwyas going to be
false

  invalidate_inode_pages2() is gated by oldsize != attr->size and
"!timespec64_equal(&old_mtime, &new_mtime)" (where new_mtime  ses the
mtime values from attr):
    for I_NEW inodes, fuse_init_inode() calls inode_set_mtime(inode,
attr->mtime, attr->mtimensec);, which means
"!timespec64_equal(&old_mtime, &new_mtime)" is always going to be
false

> issue right, who would easily notice the fuse behavior in the future.
> I kind of agree with that method if the condition would be
>
>                 if (oldsize > attr->size) {
>                         truncate_pagecache(inode, attr->size);
>
> and I don't understand why it is '!='. I.e. for a new inode oldsize
> would be 0 - the condition would never trigger and my concern would

For a new inode, oldsize is attr->size, not 0. The condition never triggers.

fuse_iget() calls fuse_init_inode() (which sets inode->i_size to
attr->size) before it calls fuse_change_attributes_i().

Thanks,
Joanne

> never be true.
>
> Thanks,
> Bernd
>
Re: [PATCH] fuse: fix inode initialization race
Posted by Bernd Schubert 1 week, 3 days ago

On 3/26/26 20:00, Joanne Koong wrote:
> On Thu, Mar 26, 2026 at 11:16 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>>
>>
>> On 3/26/26 19:00, Joanne Koong wrote:
>>> On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>>>
>>>> On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
>>>>> On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>>>>>
>>>>>> On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
>>>>>>> On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/26/26 15:26, Christian Brauner wrote:
>>>>>>>>> On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/18/26 14:43, Horst Birthelmer wrote:
>>>>>>>>>>> From: Horst Birthelmer <hbirthelmer@ddn.com>
>>>>>>>
>>>>>>>>>>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>>>>>>>>>> +   wake_up_all(&fc->attr_version_waitq);
>>>>>>>>>>>     fi->i_time = attr_valid;
>>>>>>>>
>>>>>>>>
>>>>>>>> While I'm looking at this again, wouldn't it make sense to make this
>>>>>>>> conditional? Because we wake this queue on every attr change for every
>>>>>>>> inode. And the conditional in fuse_iget() based on I_NEW?
>>>>>>>
>>>>>>> Right, should only wake if fi->attr_version old value was zero.
>>>>>>>
>>>>>>> BTW I have a hunch that there are better solutions, but it's simple
>>>>>>> enough as a stopgap measure.
>>>>>>
>>>>>> OK, I'll send a new version.
>>>>>>
>>>>>> Just out of curiosity, what would be a better solution?
>>>>>
>>>>> I'm probably missing something here but why can't we just call the
>>>>>
>>>>>         fi = get_fuse_inode(inode);
>>>>>         spin_lock(&fi->lock);
>>>>>         fi->nlookup++;
>>>>>         spin_unlock(&fi->lock);
>>>>>         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
>>>>>                                  evict_ctr);
>>>>>
>>>>> logic before releasing the inode lock (the unlock_new_inode() call) in
>>>>> fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
>>>>> fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
>>>>> after the attributes initialization has finished.
>>>>>
>>>>> As I understand it, fuse_change_attributes_i() would be pretty
>>>>> straightforward / fast for I_NEW inodes, as it doesn't send any
>>>>> synchronous requests and for the I_NEW case the
>>>>> invalidate_inode_pages2() and truncate_pagecache() calls would get
>>>>> skipped. (truncate_pagecache() getting skipped because inode->i_size
>>>>> is already attr->size from fuse_init_inode(), so "oldsize !=
>>>>> attr->size" is never true; and invalidate_inode_pages2() getting
>>>>> skipped because "oldsize != attr->size" is never true and "if
>>>>> (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
>>>>> fuse_init_inode() initialized the inode's mtime to attr->mtime).
>>>>
>>>> You understand the pretty well, I think.
>>>> The problem I have there is that fuse_change_attributes_i() takes
>>>> its own lock.
>>>> That would be a pretty big operation to split that function.
>>>
>>> I believe fuse_change_attribtues_i() takes the fi lock, not the inode
>>> lock, so this should be fine.
>>>
>>
>>
>> Ah, you want to update the attributes before unlock_new_inode()? The
>> risk I see is that I don't imediately see all code paths of
>> truncate_pagecache and invalidate_inode_pages2. Even if there is no
> 
> We never call into truncate_pagecache() or invalidate_inode_pages2()
> from fuse_change_attributes_i():
>  truncate_pagecache() is gated by oldsize != attr->size:
>     for I_NEW inodes, fuse_init_inode() sets inode->i_size to
> attr->size, which means "oldsize != attr->size" is alwyas going to be
> false
> 
>   invalidate_inode_pages2() is gated by oldsize != attr->size and
> "!timespec64_equal(&old_mtime, &new_mtime)" (where new_mtime  ses the
> mtime values from attr):
>     for I_NEW inodes, fuse_init_inode() calls inode_set_mtime(inode,
> attr->mtime, attr->mtimensec);, which means
> "!timespec64_equal(&old_mtime, &new_mtime)" is always going to be
> false
> 
>> issue right, who would easily notice the fuse behavior in the future.
>> I kind of agree with that method if the condition would be
>>
>>                 if (oldsize > attr->size) {
>>                         truncate_pagecache(inode, attr->size);
>>
>> and I don't understand why it is '!='. I.e. for a new inode oldsize
>> would be 0 - the condition would never trigger and my concern would
> 
> For a new inode, oldsize is attr->size, not 0. The condition never triggers.
> 
> fuse_iget() calls fuse_init_inode() (which sets inode->i_size to
> attr->size) before it calls fuse_change_attributes_i().


Yeah, I just see it, had missed that before. Your patch version sounds
good to me then.


Thanks,
Bernd
Re: Re: Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Horst Birthelmer 1 week, 3 days ago
On Thu, Mar 26, 2026 at 11:00:54AM -0700, Joanne Koong wrote:
> On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >
> > On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
> > > On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > > >
> > > > On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> > > > > On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 3/26/26 15:26, Christian Brauner wrote:
> > > > > > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > > > > > >>
> > > > > > >>
> > > > > > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > > > > > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> > > > >
> > > > > > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > > > > > >>> +   wake_up_all(&fc->attr_version_waitq);
> > > > > > >>>     fi->i_time = attr_valid;
> > > > > >
> > > > > >
> > > > > > While I'm looking at this again, wouldn't it make sense to make this
> > > > > > conditional? Because we wake this queue on every attr change for every
> > > > > > inode. And the conditional in fuse_iget() based on I_NEW?
> > > > >
> > > > > Right, should only wake if fi->attr_version old value was zero.
> > > > >
> > > > > BTW I have a hunch that there are better solutions, but it's simple
> > > > > enough as a stopgap measure.
> > > >
> > > > OK, I'll send a new version.
> > > >
> > > > Just out of curiosity, what would be a better solution?
> > >
> > > I'm probably missing something here but why can't we just call the
> > >
> > >         fi = get_fuse_inode(inode);
> > >         spin_lock(&fi->lock);
> > >         fi->nlookup++;
> > >         spin_unlock(&fi->lock);
> > >         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
> > >                                  evict_ctr);
> > >
> > > logic before releasing the inode lock (the unlock_new_inode() call) in
> > > fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
> > > fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
> > > after the attributes initialization has finished.
> > >
> > > As I understand it, fuse_change_attributes_i() would be pretty
> > > straightforward / fast for I_NEW inodes, as it doesn't send any
> > > synchronous requests and for the I_NEW case the
> > > invalidate_inode_pages2() and truncate_pagecache() calls would get
> > > skipped. (truncate_pagecache() getting skipped because inode->i_size
> > > is already attr->size from fuse_init_inode(), so "oldsize !=
> > > attr->size" is never true; and invalidate_inode_pages2() getting
> > > skipped because "oldsize != attr->size" is never true and "if
> > > (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
> > > fuse_init_inode() initialized the inode's mtime to attr->mtime).
> >
> > You understand the pretty well, I think.
> > The problem I have there is that fuse_change_attributes_i() takes
> > its own lock.
> > That would be a pretty big operation to split that function.
> 
> I believe fuse_change_attribtues_i() takes the fi lock, not the inode
> lock, so this should be fine.
> 

Yes, I got confused there, sorry.
Still, a pretty big change for a corner case. Don't you think?

What would be the advantage to the current situation with the requested 
changes from Miklos and Bernd, of course? So that we only do the 
wakeup_all() call only if we have initialized a new inode?

> Thanks,
> Joanne
> >
> > Is that required for this small (as Miklos put it, rare) case?
> >
> > If you guys want me to do that, I will.
> >
> > Thanks,
> > Horst
Re: Re: Re: Re: [PATCH] fuse: fix inode initialization race
Posted by Joanne Koong 1 week, 3 days ago
On Thu, Mar 26, 2026 at 11:11 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Thu, Mar 26, 2026 at 11:00:54AM -0700, Joanne Koong wrote:
> > On Thu, Mar 26, 2026 at 10:54 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > >
> > > On Thu, Mar 26, 2026 at 09:43:00AM -0700, Joanne Koong wrote:
> > > > On Thu, Mar 26, 2026 at 8:48 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > > > >
> > > > > On Thu, Mar 26, 2026 at 04:19:24PM +0100, Miklos Szeredi wrote:
> > > > > > On Thu, 26 Mar 2026 at 16:13, Bernd Schubert <bernd@bsbernd.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On 3/26/26 15:26, Christian Brauner wrote:
> > > > > > > > On Wed, Mar 25, 2026 at 08:54:57AM +0100, Bernd Schubert wrote:
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On 3/18/26 14:43, Horst Birthelmer wrote:
> > > > > > > >>> From: Horst Birthelmer <hbirthelmer@ddn.com>
> > > > > >
> > > > > > > >>>     fi->attr_version = atomic64_inc_return(&fc->attr_version);
> > > > > > > >>> +   wake_up_all(&fc->attr_version_waitq);
> > > > > > > >>>     fi->i_time = attr_valid;
> > > > > > >
> > > > > > >
> > > > > > > While I'm looking at this again, wouldn't it make sense to make this
> > > > > > > conditional? Because we wake this queue on every attr change for every
> > > > > > > inode. And the conditional in fuse_iget() based on I_NEW?
> > > > > >
> > > > > > Right, should only wake if fi->attr_version old value was zero.
> > > > > >
> > > > > > BTW I have a hunch that there are better solutions, but it's simple
> > > > > > enough as a stopgap measure.
> > > > >
> > > > > OK, I'll send a new version.
> > > > >
> > > > > Just out of curiosity, what would be a better solution?
> > > >
> > > > I'm probably missing something here but why can't we just call the
> > > >
> > > >         fi = get_fuse_inode(inode);
> > > >         spin_lock(&fi->lock);
> > > >         fi->nlookup++;
> > > >         spin_unlock(&fi->lock);
> > > >         fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
> > > >                                  evict_ctr);
> > > >
> > > > logic before releasing the inode lock (the unlock_new_inode() call) in
> > > > fuse_iget() to address the race? unlock_new_inode() clears I_NEW so
> > > > fuse_reverse_inval_inode()'s fuse_ilookup() would only get the inode
> > > > after the attributes initialization has finished.
> > > >
> > > > As I understand it, fuse_change_attributes_i() would be pretty
> > > > straightforward / fast for I_NEW inodes, as it doesn't send any
> > > > synchronous requests and for the I_NEW case the
> > > > invalidate_inode_pages2() and truncate_pagecache() calls would get
> > > > skipped. (truncate_pagecache() getting skipped because inode->i_size
> > > > is already attr->size from fuse_init_inode(), so "oldsize !=
> > > > attr->size" is never true; and invalidate_inode_pages2() getting
> > > > skipped because "oldsize != attr->size" is never true and "if
> > > > (!timespec64_equal(&old_mtime, &new_mtime))" is never true because
> > > > fuse_init_inode() initialized the inode's mtime to attr->mtime).
> > >
> > > You understand the pretty well, I think.
> > > The problem I have there is that fuse_change_attributes_i() takes
> > > its own lock.
> > > That would be a pretty big operation to split that function.
> >
> > I believe fuse_change_attribtues_i() takes the fi lock, not the inode
> > lock, so this should be fine.
> >
>
> Yes, I got confused there, sorry.
> Still, a pretty big change for a corner case. Don't you think?

I don't think it's a big change. It would just be this:

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 84f78fb89d35..bd0ec405823e 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -470,6 +470,7 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
        struct inode *inode;
        struct fuse_inode *fi;
        struct fuse_conn *fc = get_fuse_conn_super(sb);
+       bool is_new_inode = false;

        /*
         * Auto mount points get their node id from the submount root, which is
@@ -505,13 +506,13 @@ struct inode *fuse_iget(struct super_block *sb,
u64 nodeid,
        if (!inode)
                return NULL;

-       if ((inode_state_read_once(inode) & I_NEW)) {
+       is_new_inode = inode_state_read_once(inode) & I_NEW;
+       if (is_new_inode) {
                inode->i_flags |= S_NOATIME;
                if (!fc->writeback_cache || !S_ISREG(attr->mode))
                        inode->i_flags |= S_NOCMTIME;
                inode->i_generation = generation;
                fuse_init_inode(inode, attr, fc);
-               unlock_new_inode(inode);
        } else if (fuse_stale_inode(inode, generation, attr)) {
                /* nodeid was reused, any I/O on the old inode should fail */
                fuse_make_bad(inode);
@@ -528,6 +529,9 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
 done:
        fuse_change_attributes_i(inode, attr, NULL, attr_valid, attr_version,
                                 evict_ctr);
+       if (is_new_inode)
+               unlock_new_inode(inode);
+
        return inode;
 }

which imo is pretty minimal.

>
> What would be the advantage to the current situation with the requested
> changes from Miklos and Bernd, of course? So that we only do the
> wakeup_all() call only if we have initialized a new inode?

I think this is the proper fix because it eliminates the race
altogether and it to me is a lot cleaner than trying to work around
the race with the waitqueue/wakeup stuff.

Thanks,
Joanne

>
> > Thanks,
> > Joanne
> > >
> > > Is that required for this small (as Miklos put it, rare) case?
> > >
> > > If you guys want me to do that, I will.
> > >
> > > Thanks,
> > > Horst