[PATCH RFC 0/1] lockdep: Raise default STACK_TRACE_BITS when KASAN is enabled

Mikhail Gavrilov posted 1 patch 3 weeks, 3 days ago
Hi,
 
I keep hitting "BUG: MAX_STACK_TRACE_ENTRIES too low!" on a desktop
workstation (AMD Zen4, RX 7900 XTX, btrfs) running debug kernels with
KASAN + LOCKDEP + PREEMPT_FULL.  The lockdep validator dies within
9-23 hours of normal desktop use (GNOME, gaming under Wine/Proton).
 
I originally reported this in September 2025 [1] but at the time only
had the BUG splat and lock_stat.  Since then I've collected three more
reproductions with full lockdep_stats and identified the root cause.
 
[1] https://lore.kernel.org/all/CABXGCsMa-BUWx4Xwc0nBKvSwPdX4HJXf=kZZOndYmNQNELx02g@mail.gmail.com/
 
This is not a synthetic workload — it is an ordinary desktop session.
Three reproducible crashes on 7.0-rc2 and 7.0-rc3:
 
  dmesg-1: BUG at 81423s (22.6h), gnome-shell, drm_mode_closefb_ioctl
  dmesg-2: BUG at 34316s  (9.5h), kworker/btrfs-endio-write
  dmesg-3: BUG at 52360s (14.5h), showtime (Proton), amdgpu_gem_create_ioctl

All dmesg logs, lock_stat and lockdep_stats files are at:
  https://gist.github.com/NTMan/a778999ed3bf11b128ee97fb3083fb6b
 
Config: CONFIG_LOCKDEP_CHAINS_BITS=19, CONFIG_LOCKDEP_STACK_TRACE_BITS=19
(the default), CONFIG_KASAN=y, PREEMPT_FULL.
 
 
lockdep_stats at the moment of overflow
----------------------------------------
 
/proc/lockdep_stats captured after the BUG (dmesg-3):
 
  stack-trace entries:         524288 [max: 524288]   ← the ONLY exhausted resource
  number of stack traces:       22080                 ← unique traces after dedup
  number of stack hash chains:  12131
  dependency chains:           164665 [max: 524288]   ← only 31% used
  direct dependencies:          45270 [max:  65536]   ← 69%
  lock-classes:                  2811 [max:   8192]   ← 34%
  dependency chain hlocks used:              1186508 [max: 2621440]  ← 45%
 
The stack-trace entry pool is the sole bottleneck.  22080 unique stack
traces consumed the entire 524288-entry buffer, an average of ~24 frames
per trace, which is typical for deep call stacks through amdgpu + btrfs +
Wine/Proton + KASAN instrumentation.  The hash-based deduplication
(commit 12593b7467f9) is working as intended: these 22080 traces are
genuinely unique, not duplicates.
 
 
KASAN <-> lockdep feedback loop
-------------------------------
 
The stack trace from dmesg-3 reveals a pathological interaction between
KASAN slab tracking and lockdep under PREEMPT_FULL:
 
  showtime (Proton game)
   → amdgpu_gem_create_ioctl
    → drm_buddy_alloc_blocks
     → kmem_cache_alloc_noprof
      → __kasan_slab_alloc          ← KASAN tracks every slab alloc
       → kasan_save_stack
        → stack_trace_save
         → arch_stack_walk
          → is_bpf_text_address     ← stack unwinder checks BPF
           → __rcu_read_unlock
            → rcu_preempt_deferred_qs_irqrestore  ← PREEMPT_FULL
             → swake_up_one
              → lock_acquire
               → __lock_acquire
                → validate_chain
                 → check_prev_add
                  → save_trace()    ← BUG: buffer full
 
Every KASAN-tracked slab allocation captures a stack trace.  The stack
unwinder calls is_bpf_text_address(), which takes an RCU read lock.
Under PREEMPT_FULL, __rcu_read_unlock() can trigger deferred
quiescent-state processing, which calls swake_up_one(), which takes a
spinlock, which triggers lockdep validation → save_trace().
 
This creates an amplification effect: KASAN's own stack-trace capture
indirectly generates new lockdep dependency chains, consuming the buffer
from both sides.
 
 
Why "just bump the numbers" keeps coming back
----------------------------------------------
 
This BUG has been addressed by increasing limits since 2009:
 
  2009  d80c19df5fcc    Ingo: bump MAX_LOCKDEP_ENTRIES/CHAINS
  2014  Sasha Levin     Double all limits
  2019  12593b7467f9    Bart Van Assche: hash-based dedup in save_trace()
  2021  5dc33592e955    Tetsuo Handa: Kconfig knobs (current API)
  2024  Intel i915-rt   Local hack: STACK_TRACE_BITS=22 for CI
 
The dedup from 2019 helps, but it cannot collapse genuinely unique
traces — and KASAN + PREEMPT_FULL + deep GPU/filesystem stacks produce
exactly that: 22080 unique traces in 14.5 hours.
 
The attached patch raises the default LOCKDEP_STACK_TRACE_BITS from 19
to 21 (and the hash-table bits from 14 to 16) when CONFIG_KASAN=y.
Cost: +12MB of static footprint, negligible for a kernel already
spending gigabytes on KASAN shadow memory.
 
But I believe the right long-term fix is what Peter suggested in 2019
[2]: migrating lockdep's stack trace storage to stackdepot (or a
growable variant).  The KASAN feedback loop above demonstrates that the
fixed-size stack_trace buffer is fundamentally the wrong approach when
KASAN is enabled — KASAN generates unique traces faster than a static
buffer can hold them.
 
Since 2019, stackdepot has gained GFP_NOWAIT support, pre-allocation
via stack_depot_early_init(), and a lockless read path (Andrey
Konovalov's 2023-2024 series), which should address the original
concerns about calling it under graph_lock.
 
Peter, would a patch implementing growable stack trace storage — either
via stackdepot or a simpler page-allocation scheme under graph_lock
with GFP_NOWAIT — be welcome?  I'm willing to work on this if there
is agreement on the approach.
 
[2] https://lore.kernel.org/lkml/20190710220931.GH3402@hirez.programming.kicks-ass.net/
 
Thanks,
Mikhail

-- 
2.53.0