[PATCH v2 00/18] mm/ksw: Introduce real-time Kernel Stack Watch debugging tool

Jinchao Wang posted 18 patches 4 weeks, 1 day ago
There is a newer version of this series
This patch series introduces **KStackWatch**, a lightweight kernel debugging tool
for detecting kernel stack corruption in real time.

The motivation comes from scenarios where corruption occurs silently in one function
but manifests later as a crash in another, with no direct call trace linking the two.
KASAN may fail to reproduce such issues because of its heavy overhead, so these bugs
are often extremely hard to debug with existing tools.
I demonstrate this scenario in **test2 (silent corruption test)**.
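
To make the scenario concrete, here is a minimal user-space sketch of silent
corruption. It is not code from this series: `struct model_frame` and every
`model_*` name are hypothetical, and the "frame" is an ordinary struct rather
than a real kernel stack. It only illustrates that the buggy write succeeds
without any error and that the damage is noticed later, at an unrelated check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_CANARY 0xdeadbeefcafef00dULL

/* Hypothetical stand-in for a stack frame: a buffer followed by a canary. */
struct model_frame {
	char buf[8];
	uint64_t canary;
};

/* Set up the frame the way a function prologue would place a canary. */
static void model_frame_init(struct model_frame *f)
{
	memset(f->buf, 0, sizeof(f->buf));
	f->canary = MODEL_CANARY;
}

/*
 * The buggy writer: fills len bytes starting at buf with no check against
 * sizeof(buf). Bytes past the buffer land in the canary. Nothing fails here,
 * which is what makes the corruption silent.
 */
static void model_buggy_write(struct model_frame *f, size_t len)
{
	unsigned char *base = (unsigned char *)f;

	/* Keep the model inside the struct so the sketch stays well defined. */
	assert(len <= sizeof(*f) - offsetof(struct model_frame, buf));
	memset(base + offsetof(struct model_frame, buf), 0x41, len);
}

/* The late check, run by unrelated code: returns 1 while the canary is intact. */
static int model_canary_intact(const struct model_frame *f)
{
	return f->canary == MODEL_CANARY;
}
```

In this model, a call such as `model_buggy_write(&f, 12)` returns normally, and
only a later `model_canary_intact(&f)` reports the damage, with nothing tying
that report back to the writer.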

KStackWatch works by combining a hardware breakpoint with kprobe and fprobe.
It can watch a stack canary or a selected local variable and detect the moment the
corruption actually occurs. This allows developers to pinpoint the real source of
the corruption rather than only observing the final crash.

Key features include:

  - Lightweight overhead with minimal impact on bug reproducibility
  - Real-time detection of stack corruption
  - Simple configuration through `/proc/kstackwatch`
  - Support for a recursion depth filter
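
The watch lifecycle and the depth filter can be pictured with a small
user-space model. This is a sketch under stated assumptions, not the
implementation in the patches: all `model_*` names are hypothetical, the
arming hook is assumed to play the kprobe role and the disarming hook the
fprobe role, and `model_on_write()` stands in for the hardware breakpoint
firing on a write to the watched address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; none of these names come from the patches. */
struct model_watch {
	const void *addr;   /* watched address, NULL while disarmed */
	int target_depth;   /* arm only at this recursion depth (0 = outermost) */
	int cur_depth;      /* current recursion depth of the probed function */
	unsigned long hits; /* writes caught while armed */
};

/* Entry hook (kprobe role in this model): arm only at the chosen depth. */
static void model_on_entry(struct model_watch *w, const void *local)
{
	if (w->cur_depth == w->target_depth)
		w->addr = local;
	w->cur_depth++;
}

/* Exit hook (fprobe role in this model): disarm when the armed call returns. */
static void model_on_exit(struct model_watch *w)
{
	assert(w->cur_depth > 0); /* entry and exit must be balanced */
	w->cur_depth--;
	if (w->cur_depth == w->target_depth)
		w->addr = NULL;
}

/* Stand-in for the breakpoint handler: true if this write hit the watch. */
static bool model_on_write(struct model_watch *w, const void *addr)
{
	if (w->addr && w->addr == addr) {
		w->hits++;
		return true;
	}
	return false;
}
```

With `target_depth` set to 1, the outermost invocation is ignored and only a
write to the second-level invocation's variable is reported, at the moment
the write happens rather than at a later check.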

To validate the approach, the series includes a test module and a test script.

---
V2 incorporates feedback and builds on the previously proposed RFC [1] and V1 [2].
The changes are as follows:
V2:
  * Make hardware breakpoint and stack operations architecture-independent.
V1:
Core Implementation

  *   Replaced kretprobe with fprobe for function exit hooking, as suggested
      by Masami Hiramatsu
  *   Introduced per-task depth logic to track recursion across scheduling
  *   Removed the use of workqueue for a more efficient corruption check
  *   Reordered patches for better logical flow
  *   Simplified and improved commit messages throughout the series
  *   Removed initial archcheck which should be improved later


Testing and Architecture

  *   Replaced the multiple-thread test with a silent corruption test
  *   Split self-tests into a separate patch to improve clarity

Maintenance
  *   Added a new entry for KStackWatch to the MAINTAINERS file.

[1] https://lore.kernel.org/lkml/20250818122720.434981-1-wangjinchao600@gmail.com/
[2] https://lore.kernel.org/all/20250828073311.1116593-1-wangjinchao600@gmail.com/
---

The series is structured as follows:

Jinchao Wang (18):
  mm/ksw: add build system support
  mm/ksw: add ksw_config struct and parser
  mm/ksw: add /proc/kstackwatch interface
  mm/ksw: add HWBP pre-allocation support
  x86/hw_breakpoint: introduce arch_reinstall_hw_breakpoint() for atomic
    context
  perf/hw_breakpoint: add arch-independent hw_breakpoint_modify_local()
  mm/ksw: add atomic watch on/off operations
  mm/ksw: add stack probe support
  mm/ksw: implement stack canary and local var resolution logic
  mm/ksw: add per-task recursion depth tracking
  mm/ksw: coordinate watch and stack for full functionality
  mm/ksw: add self-debug functions for kstackwatch watch
  mm/ksw: add test module
  mm/ksw: add stack overflow test
  mm/ksw: add simplified silent corruption test
  mm/ksw: add recursive corruption test
  tools/kstackwatch: add interactive test script for KStackWatch
  MAINTAINERS: add entry for KStackWatch (Kernel Stack Watch)

 MAINTAINERS                           |   6 +
 arch/x86/include/asm/hw_breakpoint.h  |   1 +
 arch/x86/kernel/hw_breakpoint.c       |  50 +++++
 include/linux/hw_breakpoint.h         |   1 +
 kernel/events/hw_breakpoint.c         |  18 ++
 mm/Kconfig.debug                      |  20 ++
 mm/Makefile                           |   1 +
 mm/kstackwatch/Makefile               |   8 +
 mm/kstackwatch/kernel.c               | 260 +++++++++++++++++++++++
 mm/kstackwatch/kstackwatch.h          |  53 +++++
 mm/kstackwatch/kstackwatch_test.c     | 261 +++++++++++++++++++++++
 mm/kstackwatch/stack.c                | 286 ++++++++++++++++++++++++++
 mm/kstackwatch/watch.c                | 175 ++++++++++++++++
 tools/kstackwatch/kstackwatch_test.sh | 118 +++++++++++
 14 files changed, 1258 insertions(+)
 create mode 100644 mm/kstackwatch/Makefile
 create mode 100644 mm/kstackwatch/kernel.c
 create mode 100644 mm/kstackwatch/kstackwatch.h
 create mode 100644 mm/kstackwatch/kstackwatch_test.c
 create mode 100644 mm/kstackwatch/stack.c
 create mode 100644 mm/kstackwatch/watch.c
 create mode 100755 tools/kstackwatch/kstackwatch_test.sh

-- 
2.43.0