flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and the
set of "online cpus" is subject to change.

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -> flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -> flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new lock
order (slab_mutex -> cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -> slab_mutex) is established by

- cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
- kmem_cache_destroy()

The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
Changes in v2:
- Deleted the unnecessary comment.
- Added "Fixes" field in the commit message.
Changes in v3:
- Deleted the unnecessary comment.
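
Note on the lock-order argument in the commit message: the AB-BA
reasoning can be illustrated with a small userspace model. This is only a
sketch, not kernel code and not how lockdep is implemented. The names
lock_class, lock_order, lock_order_init() and lock_order_record() are
hypothetical and exist only for this illustration. The model records which
lock was taken while another was held and rejects the reverse pairing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks discussed above. */
enum lock_class { CPU_HOTPLUG_LOCK, SLAB_MUTEX, NR_LOCK_CLASSES };

/* order_seen[a][b] is true once some path took b while holding a. */
struct lock_order {
	bool order_seen[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void lock_order_init(struct lock_order *lo)
{
	memset(lo, 0, sizeof(*lo));
}

/*
 * Record "inner acquired while outer is held".
 * Returns false if the reverse order was recorded earlier, which is the
 * AB-BA inversion; in that case nothing new is recorded.
 */
static bool lock_order_record(struct lock_order *lo,
			      enum lock_class outer, enum lock_class inner)
{
	assert(outer != inner);

	if (lo->order_seen[inner][outer])
		return false;

	lo->order_seen[outer][inner] = true;
	return true;
}
```

In this model, recording (CPU_HOTPLUG_LOCK, SLAB_MUTEX) succeeds, which
corresponds to the existing order. Recording (SLAB_MUTEX, CPU_HOTPLUG_LOCK)
afterwards returns false, which corresponds to the order that moving
cpus_read_lock() into flush_rcu_sheaves_on_cache() would create.
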
mm/slab_common.c | 2 ++
mm/slub.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..8b661fff5eed 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2110,7 +2110,9 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
 {
 	if (cache_has_sheaves(s)) {
+		cpus_read_lock();
 		flush_rcu_sheaves_on_cache(s);
+		cpus_read_unlock();
 		rcu_barrier();
 	}

diff --git a/mm/slub.c b/mm/slub.c
index 161079ac5ba1..2a005d1e3a74 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 	struct slub_flush_work *sfw;
 	unsigned int cpu;
 
+	lockdep_assert_cpus_held();
 	mutex_lock(&flush_lock);
 
 	for_each_online_cpu(cpu) {
--
2.34.1
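
For readers unfamiliar with the "caller holds the lock, callee asserts it"
pattern used by this patch, here is a minimal userspace sketch. It is an
illustration under stated assumptions, not the kernel implementation: all
model_* names are hypothetical, a plain counter stands in for the read side
of cpu_hotplug_lock, and a bitmask stands in for the online CPU mask. The
real lockdep_assert_cpus_held() emits a warning when lockdep is enabled;
the model returns -1 instead so that the violation is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cpu_hotplug_lock read-side hold count. */
static int hotplug_read_depth;

static void model_cpus_read_lock(void)
{
	hotplug_read_depth++;
}

static void model_cpus_read_unlock(void)
{
	assert(hotplug_read_depth > 0);
	hotplug_read_depth--;
}

static bool model_cpus_held(void)
{
	return hotplug_read_depth > 0;
}

/*
 * Callee: counts the "online" CPUs it would queue work on.
 * Returns -1 when the caller did not pin the online mask first.
 */
static int model_flush_on_cache(unsigned int online_mask)
{
	int queued = 0;

	if (!model_cpus_held())
		return -1;

	/* Clear the lowest set bit on each pass, one pass per online CPU. */
	for (; online_mask; online_mask &= online_mask - 1)
		queued++;

	return queued;
}

/* Caller: takes the lock around the callee, as the patch does. */
static int model_barrier_on_cache(unsigned int online_mask)
{
	int queued;

	model_cpus_read_lock();
	queued = model_flush_on_cache(online_mask);
	model_cpus_read_unlock();

	return queued;
}
```

Calling model_flush_on_cache() directly models the unfixed path and is
rejected, while model_barrier_on_cache() models the fixed path and leaves
the hold count balanced on return.
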
On 5/12/26 05:50, Qing Wang wrote:
> flush_rcu_sheaves_on_cache() calls queue_work_on() in a
> for_each_online_cpu() loop, which requires the cpu to stay online.
> But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and the
> set of "online cpus" is subject to change.
>
> There are two paths that call flush_rcu_sheaves_on_cache():
>
>   // has cpus_read_lock()
>   flush_all_rcu_sheaves()
>     -> flush_rcu_sheaves_on_cache()
>
>   // no cpus_read_lock()
>   kvfree_rcu_barrier_on_cache()
>     -> flush_rcu_sheaves_on_cache()
>
> Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().
>
> Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
> flush_rcu_sheaves_on_cache()? The reason is it would introduce a new lock
> order (slab_mutex -> cpu_hotplug_lock). The reverse order
> (cpu_hotplug_lock -> slab_mutex) is established by
>
> - cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
> - kmem_cache_destroy()
>
> The two orders together would form an AB-BA deadlock.
>
> Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
> to catch the same problem in the future.
>
> Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
> Signed-off-by: Qing Wang <wangqing7171@gmail.com>

Added Cc: stable as Harry suggested. Applied to slab/for-next-fixes. Thanks!

> ---
> Changes in v2:
> - Deleted the unnecessary comment.
> - Added "Fixes" field in the commit message.
> Changes in v3:
> - Deleted the unnecessary comment.
>
> mm/slab_common.c | 2 ++
> mm/slub.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d5a70a831a2a..8b661fff5eed 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -2110,7 +2110,9 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
>  void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
>  {
>  	if (cache_has_sheaves(s)) {
> +		cpus_read_lock();
>  		flush_rcu_sheaves_on_cache(s);
> +		cpus_read_unlock();
>  		rcu_barrier();
>  	}
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 161079ac5ba1..2a005d1e3a74 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
>  	struct slub_flush_work *sfw;
>  	unsigned int cpu;
>
> +	lockdep_assert_cpus_held();
>  	mutex_lock(&flush_lock);
>
>  	for_each_online_cpu(cpu) {