[v1] fs/namespace: notify pollers of legacy propagation changes

[PATCH] fs/namespace: notify pollers of legacy propagation changes

Posted by Guopeng Zhang 1 week, 3 days ago

From: Guopeng Zhang <zhangguopeng@kylinos.cn>

Changing mount propagation through the legacy mount API changes
user-visible mountinfo contents, including the shared: and master:
optional fields.

The mount_setattr() path already touches the mount namespace after
change_mnt_propagation(), so pollers of /proc/<pid>/mountinfo are woken
when the namespace event changes.

The legacy mount --make-* path also changes propagation through
change_mnt_propagation(), and MOVE_MOUNT_SET_GROUP updates the
propagation relationship of the target mount. Both paths currently
return without touching the affected mount namespace.

As a result, userspace polling /proc/<pid>/mountinfo can miss these
propagation-only changes even though mountinfo has changed.

A simple reproducer that polls /proc/self/mountinfo while changing
propagation shows the inconsistency.
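
The reproducer itself is not part of this posting. A minimal sketch of what
its legacy-path half might look like follows. It assumes CAP_SYS_ADMIN, a
private mount namespace, and an existing scratch directory. The names
poll_mountinfo() and run_legacy_shared_case() are hypothetical and not taken
from the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

/* Poll fd for POLLPRI only; returns poll()'s result and stores revents. */
int poll_mountinfo(int fd, int timeout_ms, short *revents)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int ret;

	assert(revents != NULL);
	ret = poll(&pfd, 1, timeout_ms);
	*revents = pfd.revents;
	return ret;
}

/*
 * Hypothetical reproducer body for the legacy path. Assumes CAP_SYS_ADMIN,
 * a private mount namespace, and an existing directory 'dir'.
 * Returns 0 if the case ran, -1 on a setup error.
 */
int run_legacy_shared_case(const char *dir)
{
	short revents = 0;
	int fd, ret;

	if (mount("tmpfs", dir, "tmpfs", 0, NULL) < 0)
		return -1;

	/* Open after the setup mount so only the propagation change is pending. */
	fd = open("/proc/self/mountinfo", O_RDONLY);
	if (fd < 0) {
		umount(dir);
		return -1;
	}

	/* Legacy API: mount(2) with MS_SHARED ends up in do_change_type(). */
	if (mount(NULL, dir, NULL, MS_SHARED, NULL) < 0) {
		close(fd);
		umount(dir);
		return -1;
	}

	ret = poll_mountinfo(fd, 100, &revents);
	printf("legacy MS_SHARED: poll ret=%d revents=0x%x\n",
	       ret, (unsigned int)revents);

	close(fd);
	umount(dir);
	return 0;
}
```

Called from a main() that runs as root under `unshare -m`, this would be
expected to print ret=0 on a kernel without the patch and ret=1 with it,
going by the output reported in this commit message.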

Before this change:

  legacy MS_SHARED: poll ret=0 revents=0x0
  mount_setattr MS_SHARED: poll ret=1 revents=0xa

After this change:

  legacy MS_SHARED: poll ret=1 revents=0xa
  mount_setattr MS_SHARED: poll ret=1 revents=0xa

Touch the affected mount namespace after successfully changing
propagation state in do_change_type() and do_set_group(). Take the
vfsmount lock for write around touch_mnt_namespace(), as required by
its locking rules.

Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 fs/namespace.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 9a66a806a9b8..f871c7bf3bc8 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2908,6 +2908,10 @@ static int do_change_type(const struct path *path, int ms_flags)
 	for (m = mnt; m; m = (recurse ? next_mnt(m, mnt) : NULL))
 		change_mnt_propagation(m, type);
 
+	lock_mount_hash();
+	touch_mnt_namespace(mnt->mnt_ns);
+	unlock_mount_hash();
+
 	return 0;
 }
 
@@ -3479,6 +3483,11 @@ static int do_set_group(const struct path *from_path, const struct path *to_path
 		list_add(&to->mnt_share, &from->mnt_share);
 		set_mnt_shared(to);
 	}
+
+	lock_mount_hash();
+	touch_mnt_namespace(to->mnt_ns);
+	unlock_mount_hash();
+
 	return 0;
 }
 
-- 
2.43.0

Re: [PATCH] fs/namespace: notify pollers of legacy propagation changes

Posted by Christian Brauner 1 week, 3 days ago

On Fri, May 29, 2026 at 05:54:41PM +0800, Guopeng Zhang wrote:
> From: Guopeng Zhang <zhangguopeng@kylinos.cn>
> 
> Changing mount propagation through the legacy mount API changes
> user-visible mountinfo contents, including the shared: and master:
> optional fields.
> 
> The mount_setattr() path already touches the mount namespace after
> change_mnt_propagation(), so pollers of /proc/<pid>/mountinfo are woken
> when the namespace event changes.
> 
> The legacy mount --make-* path also changes propagation through
> change_mnt_propagation(), and MOVE_MOUNT_SET_GROUP updates the
> propagation relationship of the target mount. Both paths currently
> return without touching the affected mount namespace.
> 
> As a result, userspace polling /proc/<pid>/mountinfo can miss these
> propagation-only changes even though mountinfo has changed.
> 
> A simple reproducer that polls /proc/self/mountinfo while changing
> propagation shows the inconsistency.
> 
> Before this change:
> 
>   legacy MS_SHARED: poll ret=0 revents=0x0
>   mount_setattr MS_SHARED: poll ret=1 revents=0xa
> 
> After this change:
> 
>   legacy MS_SHARED: poll ret=1 revents=0xa
>   mount_setattr MS_SHARED: poll ret=1 revents=0xa
> 
> Touch the affected mount namespace after successfully changing
> propagation state in do_change_type() and do_set_group(). Take the
> vfsmount lock for write around touch_mnt_namespace(), as required by
> its locking rules.
> 
> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
> ---
>  fs/namespace.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 9a66a806a9b8..f871c7bf3bc8 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2908,6 +2908,10 @@ static int do_change_type(const struct path *path, int ms_flags)
>  	for (m = mnt; m; m = (recurse ? next_mnt(m, mnt) : NULL))
>  		change_mnt_propagation(m, type);
>  
> +	lock_mount_hash();
> +	touch_mnt_namespace(mnt->mnt_ns);
> +	unlock_mount_hash();
> +
>  	return 0;
>  }
>  
> @@ -3479,6 +3483,11 @@ static int do_set_group(const struct path *from_path, const struct path *to_path
>  		list_add(&to->mnt_share, &from->mnt_share);
>  		set_mnt_shared(to);
>  	}
> +
> +	lock_mount_hash();
> +	touch_mnt_namespace(to->mnt_ns);
> +	unlock_mount_hash();

Doing this would cause seqcount readers to retry on mount propagation
changes when all of them really only care about mount topology changes.
So this can likely use:

guard(mount_locked_reader)();
touch_mnt_namespace(mnt_ns);

Even today, observing an unchanged seqcount across mnt->mnt_flags reads
doesn't guarantee that it really wasn't changed.
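
To illustrate the point about seqcount readers, here is a single-threaded
user-space toy model. It is not kernel code. It assumes that lock_mount_hash()
takes mount_lock as a seqlock writer and that the mount_locked_reader guard
takes the same lock as an exclusive reader. All toy_* names are hypothetical.
The model shows only the bookkeeping: a writer section bumps the sequence and
forces lockless readers to retry, while an exclusive-reader section leaves the
sequence alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy seqlock: a sequence counter plus a flag standing in for the spinlock. */
struct toy_seqlock {
	unsigned int seq;
	bool locked;
};

/* Writer side: takes the lock and bumps seq on entry and on exit. */
void toy_write_lock(struct toy_seqlock *sl)
{
	assert(!sl->locked);
	sl->locked = true;
	sl->seq++;		/* odd: write in progress */
}

void toy_write_unlock(struct toy_seqlock *sl)
{
	sl->seq++;		/* even again, but different from before */
	sl->locked = false;
}

/* Exclusive reader: takes the same lock but never touches seq. */
void toy_read_lock_excl(struct toy_seqlock *sl)
{
	assert(!sl->locked);
	sl->locked = true;
}

void toy_read_unlock_excl(struct toy_seqlock *sl)
{
	sl->locked = false;
}

/* Lockless reader: sample seq, do the reads, then check for a change. */
unsigned int toy_read_begin(const struct toy_seqlock *sl)
{
	return sl->seq;
}

bool toy_read_retry(const struct toy_seqlock *sl, unsigned int start)
{
	return (start & 1) || sl->seq != start;
}
```

In this model, a lockless reader that sampled the sequence before a
toy_read_lock_excl()/toy_read_unlock_excl() pair does not retry, while the
same reader does retry after a toy_write_lock()/toy_write_unlock() pair. A
real seqlock also needs memory barriers and an actual lock, which the toy
leaves out.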

Re: [PATCH] fs/namespace: notify pollers of legacy propagation changes

Posted by Guopeng Zhang 1 week ago


On 2026/5/29 18:23, Christian Brauner wrote:
> On Fri, May 29, 2026 at 05:54:41PM +0800, Guopeng Zhang wrote:
>> @@ -3479,6 +3483,11 @@ static int do_set_group(const struct path *from_path, const struct path *to_path
>>  		list_add(&to->mnt_share, &from->mnt_share);
>>  		set_mnt_shared(to);
>>  	}
>> +
>> +	lock_mount_hash();
>> +	touch_mnt_namespace(to->mnt_ns);
>> +	unlock_mount_hash();
> 
> Doing this would cause seqcount readers to retry on mount propagation
> changes when all of them really only care about mount topology changes.
> So this can likely use:
> 
> guard(mount_locked_reader)();
> touch_mnt_namespace(mnt_ns);
> 
> Even today, observing an unchanged seqcount across mnt->mnt_flags reads
> doesn't guarantee that it really wasn't changed.

Hi Christian,

Thanks for the review and explanation.

I will send a v2 with the suggested changes.

Thanks,
Guopeng