[PATCH v2 0/5] mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'

Mauricio Faria de Oliveira posted 5 patches an hour ago
Documentation/mm/page_owner.rst | 32 ++++++++++++++++-
mm/page_owner.c                 | 61 +++++++++++++++++++++++++++------
2 files changed, 81 insertions(+), 12 deletions(-)
Changelog:
 - v2:
   - Make context/usecase and hashing in userspace more explicit
     and add an example in the cover letter. (Michal Hocko)
   - Use different output files. (Michal Hocko, Vlastimil Babka)
   - Add context struct for plumbing flags from debugfs_create_file()
     down to stack_print(), avoiding more struct {file,seq}_operations
     for the new files.
   - Simplify commit messages.

 - v1:
   https://lore.kernel.org/linux-mm/20250924174023.261125-1-mfo@igalia.com/

Context:

The page_owner debug feature can help understand a particular situation at
a point in time (e.g., identify the biggest memory consumers; verify memory
counters that do not add up).

Another useful usecase is to collect data repeatedly over time, and use it
for profiling, monitoring, and even comparing different kernel versions,
at the stack trace level (e.g., watch for trends, leaks, correlations, and
regressions).

For this usecase, userspace periodically collects the data from page_owner
and organizes it in data structures appropriate for access per stack trace.

Problem:

The usecase of tracking memory usage per stack trace (or tracking it for
a particular stack trace) requires uniquely identifying each stack trace
(i.e., keys to store their memory usage over periodic data collections).

This has to be done for every stack trace in every sample/data collection,
even if tracking only one stack trace (to identify it among all others).

Therefore, an approach like hashing the stack traces in userspace to create
unique keys/identifiers for them during post-processing can quickly become
expensive, considering the repetition and a growing number of stack traces.
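
For illustration only, a minimal sketch of that userspace hashing approach,
assuming the 'show_stacks' record layout shown under Testing (stack frames
followed by an 'nr_base_pages:' line). The function name and the choice of
SHA-1 are assumptions made for the example, not part of this patchset:

```python
import hashlib

# One 'show_stacks' record, copied from the Testing section.
SAMPLE = (
    " register_dummy_stack+0x32/0x70\n"
    " init_page_owner+0x29/0x2f0\n"
    " page_ext_init+0x27c/0x2b0\n"
    " mm_core_init+0xdc/0x110\n"
    "nr_base_pages: 47\n"
    "\n"
)


def parse_show_stacks(text):
    """Return {key: nr_base_pages}; key is a digest of the stack trace text."""
    usage = {}
    frames = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("nr_base_pages:"):
            pages = int(line.split(":", 1)[1])
            # Every frame of every stack trace is hashed in every sample,
            # which is the repeated cost described above.
            key = hashlib.sha1("\n".join(frames).encode()).hexdigest()
            usage[key] = usage.get(key, 0) + pages
            frames = []
        else:
            frames.append(line)
    return usage


if __name__ == "__main__":
    for key, pages in parse_show_stacks(SAMPLE).items():
        print(key, pages)
```

The cost scales with the total amount of stack trace text per sample, even
when only one stack trace is of interest.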

Solution:

Fortunately, the kernel can provide a unique identifier for stack traces
in page_owner, which is the handle number in stackdepot. This eliminates
the need for creating keys (hashing) in userspace during post-processing.

Additionally, with that information, the stack traces themselves are not
needed until a handle has to be resolved to its stack trace (say, to look
at the stack traces of a few top consumers). This can reduce the amount of
text emitted/copied by the kernel to userspace, and save userspace from
matching and discarding stack traces when they are not needed.

Changes:

This patchset adds 2 debugfs files that provide information similarly to
'show_stacks':
 - show_handles: print handle number and number of pages (no stack traces)
 - show_stacks_handles: print handle numbers and stack traces (no pages)

Now, it's possible to periodically collect data with handle numbers (keys)
and without stack traces (lower overhead) from 'show_handles', and later do
a final collection with handles and stack traces from 'show_stacks_handles'
to resolve the handles to their stack traces.

The output format follows the existing 'show_stacks' file, for simplicity,
but it can certainly be changed if a different format is more convenient.

Example:

The number of base pages collected can be stored per-handle number over the
periodic data collections, and finally resolved to stack traces per-handle
number as well with a final collection.

Later, one can, for example, identify the biggest consumers and watch their
trends, correlate increases/decreases with other events in the system, or
watch particular stack traces of interest during development.
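
A rough sketch of such post-processing, assuming the record layouts shown
under Testing for 'show_handles' and 'show_stacks_handles'. The function
names are made up for the example, and handle 2 with its stack frame is
hypothetical sample data, not real output:

```python
def parse_show_handles(text):
    """Return {handle: nr_base_pages} from 'show_handles' output."""
    usage = {}
    handle = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        if key == "handle":
            handle = int(value)
        elif key == "nr_base_pages" and handle is not None:
            usage[handle] = int(value)
            handle = None
    return usage


def parse_show_stacks_handles(text):
    """Return {handle: [frames]} from 'show_stacks_handles' output."""
    stacks = {}
    frames = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("handle:"):
            stacks[int(line.split(":", 1)[1])] = frames
            frames = []
        else:
            frames.append(line)
    return stacks


# Two periodic samples of 'show_handles' (handle 2 is hypothetical).
T0 = "handle: 1\nnr_base_pages: 47\n\nhandle: 2\nnr_base_pages: 10\n\n"
T1 = "handle: 1\nnr_base_pages: 47\n\nhandle: 2\nnr_base_pages: 64\n\n"

# One final sample of 'show_stacks_handles' (second stack is hypothetical).
FINAL = (
    " register_dummy_stack+0x32/0x70\n"
    " init_page_owner+0x29/0x2f0\n"
    " page_ext_init+0x27c/0x2b0\n"
    " mm_core_init+0xdc/0x110\n"
    "handle: 1\n"
    "\n"
    " hypothetical_alloc+0x10/0x20\n"
    "handle: 2\n"
    "\n"
)

samples = [parse_show_handles(t) for t in (T0, T1)]
stacks = parse_show_stacks_handles(FINAL)

# Biggest consumer in the last sample, its trend, and its stack trace.
top = max(samples[-1], key=samples[-1].get)
trend = [s.get(top, 0) for s in samples]

if __name__ == "__main__":
    print(top, trend, stacks[top])
```

The handle number is used directly as the dictionary key, so no stack trace
text is parsed or hashed during the periodic collections.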

Testing:

Tested on next-20250929.

 - show_stacks:
	
	 register_dummy_stack+0x32/0x70
	 init_page_owner+0x29/0x2f0
	 page_ext_init+0x27c/0x2b0
	 mm_core_init+0xdc/0x110
	nr_base_pages: 47
		
 - show_handles:

	handle: 1
	nr_base_pages: 47

 - show_stacks_handles:

	 register_dummy_stack+0x32/0x70
	 init_page_owner+0x29/0x2f0
	 page_ext_init+0x27c/0x2b0
	 mm_core_init+0xdc/0x110
	handle: 1
	
 - count_threshold:

	# echo 100 >/sys/kernel/debug/page_owner_stacks/count_threshold
	# grep register_dummy_stack show_stacks		# not present
	# grep -B4 '^handle: 1$' show_handles  		# not present 
	# grep -B4 '^handle: 1$' show_stacks_handles	# present
	 register_dummy_stack+0x32/0x70
	 init_page_owner+0x29/0x2f0
	 page_ext_init+0x27c/0x2b0
	 mm_core_init+0xdc/0x110
	handle: 1

Mauricio Faria de Oliveira (5):
  mm/page_owner: introduce struct stack_print_ctx
  mm/page_owner: add struct stack_print_ctx.flags
  mm/page_owner: add debugfs file 'show_handles'
  mm/page_owner: add debugfs file 'show_stacks_handles'
  mm/page_owner: update Documentation with 'show_handles' and
    'show_stacks_handles'

 Documentation/mm/page_owner.rst | 32 ++++++++++++++++-
 mm/page_owner.c                 | 61 +++++++++++++++++++++++++++------
 2 files changed, 81 insertions(+), 12 deletions(-)

-- 
2.48.1