[PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access

Song Liu posted 4 patches 3 months, 3 weeks ago
[PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by Song Liu 3 months, 3 weeks ago
Existing kernfs_xattr_get() locks iattr_mutex, so it cannot be used in
RCU critical sections. Introduce __kernfs_xattr_get(), which reads xattr
under RCU read lock. This can be used by BPF programs to access cgroupfs
xattrs.

Signed-off-by: Song Liu <song@kernel.org>
---
 fs/kernfs/inode.c      | 14 ++++++++++++++
 include/linux/kernfs.h |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index b83054da68b3..0ca231d2012c 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -302,6 +302,20 @@ int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
 	return simple_xattr_get(&attrs->xattrs, name, value, size);
 }
 
+int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
+		       void *value, size_t size)
+{
+	struct kernfs_iattrs *attrs;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	attrs = rcu_dereference(kn->iattr);
+	if (!attrs)
+		return -ENODATA;
+
+	return simple_xattr_get(&attrs->xattrs, name, value, size);
+}
+
 int kernfs_xattr_set(struct kernfs_node *kn, const char *name,
 		     const void *value, size_t size, int flags)
 {
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index b5a5f32fdfd1..8536ffc5c9f1 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -456,6 +456,8 @@ void kernfs_notify(struct kernfs_node *kn);
 
 int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
 		     void *value, size_t size);
+int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
+		       void *value, size_t size);
 int kernfs_xattr_set(struct kernfs_node *kn, const char *name,
 		     const void *value, size_t size, int flags);
 
-- 
2.47.1
Re: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by kernel test robot 3 months, 3 weeks ago
Hi Song,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Song-Liu/kernfs-Add-__kernfs_xattr_get-for-RCU-protected-access/20250619-074026
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20250618233739.189106-2-song%40kernel.org
patch subject: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
config: m68k-randconfig-r122-20250619 (https://download.01.org/0day-ci/archive/20250619/202506192154.T111naKp-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 8.5.0
reproduce: (https://download.01.org/0day-ci/archive/20250619/202506192154.T111naKp-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506192154.T111naKp-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> fs/kernfs/inode.c:312:17: sparse: sparse: incompatible types in comparison expression (different address spaces):
   fs/kernfs/inode.c:312:17: sparse:    struct kernfs_iattrs [noderef] __rcu *
   fs/kernfs/inode.c:312:17: sparse:    struct kernfs_iattrs *

vim +312 fs/kernfs/inode.c

   304	
   305	int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
   306			       void *value, size_t size)
   307	{
   308		struct kernfs_iattrs *attrs;
   309	
   310		WARN_ON_ONCE(!rcu_read_lock_held());
   311	
 > 312		attrs = rcu_dereference(kn->iattr);
   313		if (!attrs)
   314			return -ENODATA;
   315	
   316		return simple_xattr_get(&attrs->xattrs, name, value, size);
   317	}
   318	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by Christian Brauner 3 months, 3 weeks ago
On Wed, Jun 18, 2025 at 04:37:36PM -0700, Song Liu wrote:
> Existing kernfs_xattr_get() locks iattr_mutex, so it cannot be used in
> RCU critical sections. Introduce __kernfs_xattr_get(), which reads xattr
> under RCU read lock. This can be used by BPF programs to access cgroupfs
> xattrs.
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  fs/kernfs/inode.c      | 14 ++++++++++++++
>  include/linux/kernfs.h |  2 ++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
> index b83054da68b3..0ca231d2012c 100644
> --- a/fs/kernfs/inode.c
> +++ b/fs/kernfs/inode.c
> @@ -302,6 +302,20 @@ int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
>  	return simple_xattr_get(&attrs->xattrs, name, value, size);
>  }
>  
> +int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
> +		       void *value, size_t size)
> +{
> +	struct kernfs_iattrs *attrs;
> +
> +	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	attrs = rcu_dereference(kn->iattr);
> +	if (!attrs)
> +		return -ENODATA;

Hm, that looks a bit silly. Which isn't your fault. I'm looking at the
kernfs code that does the xattr allocations and I think that's the
origin of the silliness. It uses a single global mutex for all kernfs
users thus serializing all allocations for kernfs->iattr. That seems
crazy but maybe I'm missing a good reason.

I'm appending a patch to remove that mutex. @Greg, @Tejun, can you take
a look whether that makes sense to you. Then I can take that patch and
you can build yours on top of the series and I'll pick it all up in one
go.

You should then just use READ_ONCE(kn->iattr) or the
kernfs_iattrs_noalloc(kn) helper in your kfunc.
Re: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by Tejun Heo 3 months, 2 weeks ago
On Thu, Jun 19, 2025 at 12:01:19PM +0200, Christian Brauner wrote:
> From bdc53435a1cd5c456dc28d8239eff0e7fa4e8dda Mon Sep 17 00:00:00 2001
> From: Christian Brauner <brauner@kernel.org>
> Date: Thu, 19 Jun 2025 11:50:26 +0200
> Subject: [PATCH] kernfs: remove iattr_mutex
> 
> All allocations of struct kernfs_iattrs are serialized through a global
> mutex. Simply do a racy allocation and let the first one win. I bet most
> callers are under inode->i_rwsem anyway and it wouldn't be needed but
> let's not require that.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun
Re: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by Song Liu 3 months, 3 weeks ago

> On Jun 19, 2025, at 3:01 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> On Wed, Jun 18, 2025 at 04:37:36PM -0700, Song Liu wrote:
>> Existing kernfs_xattr_get() locks iattr_mutex, so it cannot be used in
>> RCU critical sections. Introduce __kernfs_xattr_get(), which reads xattr
>> under RCU read lock. This can be used by BPF programs to access cgroupfs
>> xattrs.
>> 
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>> fs/kernfs/inode.c      | 14 ++++++++++++++
>> include/linux/kernfs.h |  2 ++
>> 2 files changed, 16 insertions(+)
>> 
>> diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
>> index b83054da68b3..0ca231d2012c 100644
>> --- a/fs/kernfs/inode.c
>> +++ b/fs/kernfs/inode.c
>> @@ -302,6 +302,20 @@ int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
>> return simple_xattr_get(&attrs->xattrs, name, value, size);
>> }
>> 
>> +int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
>> +       void *value, size_t size)
>> +{
>> + struct kernfs_iattrs *attrs;
>> +
>> + WARN_ON_ONCE(!rcu_read_lock_held());
>> +
>> + attrs = rcu_dereference(kn->iattr);
>> + if (!attrs)
>> + return -ENODATA;
> 
> Hm, that looks a bit silly. Which isn't your fault. I'm looking at the
> kernfs code that does the xattr allocations and I think that's the
> origin of the silliness. It uses a single global mutex for all kernfs
> users thus serializing all allocations for kernfs->iattr. That seems
> crazy but maybe I'm missing a good reason.
> 
> I'm appending a patch to remove that mutex. @Greg, @Tejun, can you take
> a look whether that makes sense to you. Then I can take that patch and
> you can build yours on top of the series and I'll pick it all up in one
> go.
> 
> You should then just use READ_ONCE(kn->iattr) or the
> kernfs_iattrs_noalloc(kn) helper in your kfunc.
> <0001-kernfs-remove-iattr_mutex.patch>

This looks better indeed. 

I will drop 1/4 and include this patch. 

Thanks,
Song

Re: [PATCH bpf-next 1/4] kernfs: Add __kernfs_xattr_get for RCU protected access
Posted by Greg Kroah-Hartman 3 months, 3 weeks ago
On Thu, Jun 19, 2025 at 12:01:19PM +0200, Christian Brauner wrote:
> On Wed, Jun 18, 2025 at 04:37:36PM -0700, Song Liu wrote:
> > Existing kernfs_xattr_get() locks iattr_mutex, so it cannot be used in
> > RCU critical sections. Introduce __kernfs_xattr_get(), which reads xattr
> > under RCU read lock. This can be used by BPF programs to access cgroupfs
> > xattrs.
> > 
> > Signed-off-by: Song Liu <song@kernel.org>
> > ---
> >  fs/kernfs/inode.c      | 14 ++++++++++++++
> >  include/linux/kernfs.h |  2 ++
> >  2 files changed, 16 insertions(+)
> > 
> > diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
> > index b83054da68b3..0ca231d2012c 100644
> > --- a/fs/kernfs/inode.c
> > +++ b/fs/kernfs/inode.c
> > @@ -302,6 +302,20 @@ int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
> >  	return simple_xattr_get(&attrs->xattrs, name, value, size);
> >  }
> >  
> > +int __kernfs_xattr_get(struct kernfs_node *kn, const char *name,
> > +		       void *value, size_t size)
> > +{
> > +	struct kernfs_iattrs *attrs;
> > +
> > +	WARN_ON_ONCE(!rcu_read_lock_held());
> > +
> > +	attrs = rcu_dereference(kn->iattr);
> > +	if (!attrs)
> > +		return -ENODATA;
> 
> Hm, that looks a bit silly. Which isn't your fault. I'm looking at the
> kernfs code that does the xattr allocations and I think that's the
> origin of the silliness. It uses a single global mutex for all kernfs
> users thus serializing all allocations for kernfs->iattr. That seems
> crazy but maybe I'm missing a good reason.
> 
> I'm appending a patch to remove that mutex. @Greg, @Tejun, can you take
> a look whether that makes sense to you. Then I can take that patch and
> you can build yours on top of the series and I'll pick it all up in one
> go.

Looks sane to me, thanks!

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>