From: Jinliang Zheng <alexjlzheng@tencent.com>
The kernfs implementation uses a single coarse-grained lock
(kernfs_idr_lock), so every kernfs-based filesystem (e.g., sysfs,
cgroup) contends for the same lock.

This patch switches the global kernfs_idr_lock to a per-fs lock by
moving the spinlock into kernfs_root.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/kernfs/dir.c | 14 +++++++-------
fs/kernfs/kernfs-internal.h | 1 +
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index fc70d72c3fe8..355d943ffe27 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -27,7 +27,6 @@ DEFINE_RWLOCK(kernfs_rename_lock); /* kn->parent and ->name */
*/
static DEFINE_SPINLOCK(kernfs_pr_cont_lock);
static char kernfs_pr_cont_buf[PATH_MAX]; /* protected by pr_cont_lock */
-static DEFINE_SPINLOCK(kernfs_idr_lock); /* root->ino_idr */
#define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
@@ -584,9 +583,9 @@ void kernfs_put(struct kernfs_node *kn)
if (kernfs_type(kn) == KERNFS_LINK)
kernfs_put(kn->symlink.target_kn);
- spin_lock(&kernfs_idr_lock);
+ spin_lock(&root->kernfs_idr_lock);
idr_remove(&root->ino_idr, (u32)kernfs_ino(kn));
- spin_unlock(&kernfs_idr_lock);
+ spin_unlock(&root->kernfs_idr_lock);
call_rcu(&kn->rcu, kernfs_free_rcu);
@@ -639,13 +638,13 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
goto err_out1;
idr_preload(GFP_KERNEL);
- spin_lock(&kernfs_idr_lock);
+ spin_lock(&root->kernfs_idr_lock);
ret = idr_alloc_cyclic(&root->ino_idr, kn, 1, 0, GFP_ATOMIC);
if (ret >= 0 && ret < root->last_id_lowbits)
root->id_highbits++;
id_highbits = root->id_highbits;
root->last_id_lowbits = ret;
- spin_unlock(&kernfs_idr_lock);
+ spin_unlock(&root->kernfs_idr_lock);
idr_preload_end();
if (ret < 0)
goto err_out2;
@@ -681,9 +680,9 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
return kn;
err_out3:
- spin_lock(&kernfs_idr_lock);
+ spin_lock(&root->kernfs_idr_lock);
idr_remove(&root->ino_idr, (u32)kernfs_ino(kn));
- spin_unlock(&kernfs_idr_lock);
+ spin_unlock(&root->kernfs_idr_lock);
err_out2:
kmem_cache_free(kernfs_node_cache, kn);
err_out1:
@@ -989,6 +988,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
return ERR_PTR(-ENOMEM);
idr_init(&root->ino_idr);
+ spin_lock_init(&root->kernfs_idr_lock);
init_rwsem(&root->kernfs_rwsem);
init_rwsem(&root->kernfs_iattr_rwsem);
init_rwsem(&root->kernfs_supers_rwsem);
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 40a2a9cd819d..24e9514565ac 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -40,6 +40,7 @@ struct kernfs_root {
/* private fields, do not use outside kernfs proper */
struct idr ino_idr;
+ spinlock_t kernfs_idr_lock; /* root->ino_idr */
u32 last_id_lowbits;
u32 id_highbits;
struct kernfs_syscall_ops *syscall_ops;
--
2.48.1
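For readers evaluating the change outside the kernel tree, the per-root locking pattern above can be sketched in userspace. This is a hypothetical illustration, not kernel code: a pthread mutex stands in for root->kernfs_idr_lock, a small modulo counter stands in for the IDR's cyclic allocation, and the names fake_root, fake_root_init, and fake_alloc_id are invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Userspace analog of kernfs_root: each filesystem instance owns its
 * own ID allocator *and* the lock protecting it, so separate instances
 * never contend with one another -- the point of the patch above. */
struct fake_root {
	pthread_mutex_t idr_lock;	/* stands in for root->kernfs_idr_lock */
	uint32_t last_id_lowbits;	/* last low-bits value handed out */
	uint32_t id_highbits;		/* bumped each time the low bits wrap */
};

#define LOWBITS_MAX 4	/* tiny cyclic range so the wrap is easy to see */

static void fake_root_init(struct fake_root *root)
{
	pthread_mutex_init(&root->idr_lock, NULL);
	root->last_id_lowbits = 0;
	root->id_highbits = 0;
}

/* Cyclic allocation under the per-root lock.  When the new low-bits
 * value is smaller than the previous one, the counter has wrapped, so
 * the high bits are bumped -- the same bookkeeping __kernfs_new_node()
 * performs around idr_alloc_cyclic(). */
static uint64_t fake_alloc_id(struct fake_root *root)
{
	uint32_t next;
	uint64_t id;

	pthread_mutex_lock(&root->idr_lock);
	next = root->last_id_lowbits % LOWBITS_MAX + 1;	/* 1..LOWBITS_MAX */
	if (next < root->last_id_lowbits)		/* wrapped around */
		root->id_highbits++;
	root->last_id_lowbits = next;
	id = ((uint64_t)root->id_highbits << 32) | next;
	pthread_mutex_unlock(&root->idr_lock);
	return id;
}
```

With two fake_root instances, allocations from one never touch the other's lock or counters, which is exactly what moving the spinlock into kernfs_root buys; after LOWBITS_MAX allocations the low bits wrap and the high bits grow, mirroring how kernfs keeps 64-bit IDs unique within a root.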
On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
>
> This patch switches the global kernfs_idr_lock to per-fs lock, which
> put the spinlock into kernfs_root.
>
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>

Given that it doesn't really make things any more complicated, I think this
makes more sense than the existing code even without any direct evidence
that this improves performance.

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

--
tejun
On Mon, 14 Apr 2025 07:09:28 -1000, tj@kernel.org wrote:
> On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> >
> > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> >
> > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > put the spinlock into kernfs_root.
> >
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
>
> Given that it doesn't really make things any more complicated, I think this
> makes more sense than the existing code even without any direct evidence
> that this improves performance.

I agree.

thanks,
Jinliang Zheng :)

>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> Thanks.
>
> --
> tejun
On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
>
> This patch switches the global kernfs_idr_lock to per-fs lock, which
> put the spinlock into kernfs_root.
>
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>  fs/kernfs/dir.c             | 14 +++++++-------
>  fs/kernfs/kernfs-internal.h |  1 +
>  2 files changed, 8 insertions(+), 7 deletions(-)

What kind of testing / benchmark did you do for this series that shows
that this works, AND that this actually is measurable? What workload
are you doing that causes these changes to be needed?

thanks,

greg k-h
On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> >
> > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> >
> > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > put the spinlock into kernfs_root.
> >
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > ---
> >  fs/kernfs/dir.c             | 14 +++++++-------
> >  fs/kernfs/kernfs-internal.h |  1 +
> >  2 files changed, 8 insertions(+), 7 deletions(-)
>
> What kind of testing / benchmark did you do for this series that shows
> that this works, AND that this actually is measureable? What workload
> are you doing that causes these changes to be needed?

Thank you for your reply. :)

We are trying to implement a kernfs-based filesystem that will have
multiple instances running at the same time, i.e., multiple kernfs_roots.

While investigating the kernfs implementation, we found some global locks
that would cause noticeable lock contention when there are many filesystem
instances.

Fortunately, we found that some optimizations have been made in [1], which
moved kernfs_rwsem into kernfs_root. But there are still some global locks
left.

We think it is also necessary to switch the remaining global locks to
per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:

  "... this is the right thing to do even if there is no concrete
  performance argument (not saying there isn't). It's just weird to
  entangle these completely unrelated users in a single rwsem."

We think kernfs will be widely used to build other filesystems, so we
strongly recommend switching global locks to per-fs.

Thank you,
Jinliang Zheng :)

[1] https://lore.kernel.org/all/YZbbxK1F7jY%2FRBFF@slm.duckdns.org/

>
> thanks,
>
> greg k-h
On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > >
> > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > >
> > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > put the spinlock into kernfs_root.
> > >
> > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > ---
> > >  fs/kernfs/dir.c             | 14 +++++++-------
> > >  fs/kernfs/kernfs-internal.h |  1 +
> > >  2 files changed, 8 insertions(+), 7 deletions(-)
> >
> > What kind of testing / benchmark did you do for this series that shows
> > that this works, AND that this actually is measureable? What workload
> > are you doing that causes these changes to be needed?
>
> Thank you for your reply. :)
>
> We are trying to implement a kernfs-based filesystem that will have
> multiple instances running at the same time, i.e., multiple kernfs_roots.

I don't think that kernfs is meant for that very well, what is that
filesystem going to be for?

> While investigating the kernfs implementation, we found some global locks
> that would cause noticeable lock contention when there are many filesystem
> instances.
>
> Fortunately, we found that some optimizations have been made in [1], which
> moved kernfs_rwsem into kernfs_root. But there are still some global locks
> left.
>
> We think it is also necessary to switch the remaining global locks to
> per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:
>
>   "... this is the right thing to do even if there is no concrete
>   performance argument (not saying there isn't). It's just weird to
>   entangle these completely unrelated users in a single rwsem."
>
> We think kernfs will be widely used to build other filesystems, so we
> strongly recommend switching global locks to per-fs.

I don't strongly object, but I would like to see some real-world numbers
first.

thanks,

greg k-h
> On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > >
> > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > >
> > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > put the spinlock into kernfs_root.
> > > >
> > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > ---
> > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > >
> > > What kind of testing / benchmark did you do for this series that shows
> > > that this works, AND that this actually is measureable? What workload
> > > are you doing that causes these changes to be needed?
> >
> > Thank you for your reply. :)
> >
> > We are trying to implement a kernfs-based filesystem that will have
> > multiple instances running at the same time, i.e., multiple kernfs_roots.
>
> I don't think that kernfs is meant for that very well, what is that
> filesystem going to be for?

Thank you for your reply. :)

Similar to cgroupfs and sysfs, it is used to export the status and configurations
of some kernel variables in hierarchical modes of the kernel. The only difference
is that it may have many instances, that is, many kernfs_roots.

> > While investigating the kernfs implementation, we found some global locks
> > that would cause noticeable lock contention when there are many filesystem
> > instances.
> >
> > Fortunately, we found that some optimizations have been made in [1], which
> > moved kernfs_rwsem into kernfs_root. But there are still some global locks
> > left.
> >
> > We think it is also necessary to switch the remaining global locks to
> > per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:
> >
> >   "... this is the right thing to do even if there is no concrete
> >   performance argument (not saying there isn't). It's just weird to
> >   entangle these completely unrelated users in a single rwsem."
> >
> > We think kernfs will be widely used to build other filesystems, so we
> > strongly recommend switching global locks to per-fs.
>
> I don't strongly object, but I would like to see some real-world numbers first.

Haha, we are still evaluating whether to implement it based on kernfs, so it
has not been implemented and tested yet.

However, for these global locks, I think it is more reasonable to switch to
per-fs, regardless of whether there is a significant performance improvement
(in fact, when the number of kernfs-based filesystem instances increases, the
lock contention will definitely be reduced. At least, it will not get worse,
hahaha).

Haha, but if you think it is not necessary to do this, just ignore this
patchset.

Thank you,
Jinliang Zheng. :)

>
> thanks,
>
> greg k-h
On Mon, Apr 14, 2025 at 11:20:54AM +0800, Jinliang Zheng wrote:
> > On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > >
> > > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > > >
> > > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > > put the spinlock into kernfs_root.
> > > > >
> > > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > ---
> > > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > > >
> > > > What kind of testing / benchmark did you do for this series that shows
> > > > that this works, AND that this actually is measureable? What workload
> > > > are you doing that causes these changes to be needed?
> > >
> > > Thank you for your reply. :)
> > >
> > > We are trying to implement a kernfs-based filesystem that will have
> > > multiple instances running at the same time, i.e., multiple kernfs_roots.
> >
> > I don't think that kernfs is meant for that very well, what is that
> > filesystem going to be for?
>
> Thank you for your reply. :)
>
> Similar to cgroupfs and sysfs, it is used to export the status and configurations
> of some kernel variables in hierarchical modes of the kernel. The only difference
> is that it may have many instances, that is, many kernfs_roots.

Let's see that filesystem first please before determining more, as you
would be adding a new user/kernel api that we all need to argue about :)

Anyway, for the 2 patches that Tejun agrees with here, can you resend
just them?

thanks,

greg k-h
On Tue, 15 Apr 2025 17:16:46 +0200, gregkh@linuxfoundation.org wrote:
> On Mon, Apr 14, 2025 at 11:20:54AM +0800, Jinliang Zheng wrote:
> > > On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > > > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > >
> > > > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > > > >
> > > > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > > > put the spinlock into kernfs_root.
> > > > > >
> > > > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > > ---
> > > > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > > > >
> > > > > What kind of testing / benchmark did you do for this series that shows
> > > > > that this works, AND that this actually is measureable? What workload
> > > > > are you doing that causes these changes to be needed?
> > > >
> > > > Thank you for your reply. :)
> > > >
> > > > We are trying to implement a kernfs-based filesystem that will have
> > > > multiple instances running at the same time, i.e., multiple kernfs_roots.
> > >
> > > I don't think that kernfs is meant for that very well, what is that
> > > filesystem going to be for?
> >
> > Thank you for your reply. :)
> >
> > Similar to cgroupfs and sysfs, it is used to export the status and configurations
> > of some kernel variables in hierarchical modes of the kernel. The only difference
> > is that it may have many instances, that is, many kernfs_roots.
>
> Let's see that filesystem first please before determining more, as you
> would be adding a new user/kernel api that we all need to argue about :)
>
> Anyway, for the 2 patches that Tejun agrees with here, can you resend
> just them?

Fine. I have resent them at
https://lore.kernel.org/all/20250415153659.14950-1-alexjlzheng@tencent.com/

thanks,
Jinliang Zheng. :)

>
> thanks,
>
> greg k-h