From: Oscar Salvador
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko,
 Vlastimil Babka, Eric Dumazet, Waiman Long, Suren Baghdasaryan,
 Marco Elver, Andrey Konovalov, Alexander Potapenko, Oscar Salvador
Subject: [PATCH v5 1/3] lib/stackdepot: Add a refcount field in stack_record
Date: Tue, 16 May 2023 16:03:31 +0200
Message-Id: <20230516140333.3776-2-osalvador@suse.de>
In-Reply-To: <20230516140333.3776-1-osalvador@suse.de>
References: <20230516140333.3776-1-osalvador@suse.de>

We want to filter page_owner output and print only those stacks that
have been repeated beyond a certain threshold. This gives us the chance
to get rid of a lot of noise.
In order to do that, we need to keep track of how many times each
(allocation) stack is repeated, so we add a new refcount_t field to the
stack_record struct.

Note that this might increase the size of the struct on some
architectures. E.g. x86_64 is not affected due to alignment, but 32-bit
x86 might be.
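To make the size remark concrete, here is a minimal userspace sketch, not kernel code. It assumes the record begins with a `next` pointer followed by three 32-bit fields (hash, size, handle); the pointer is not visible in this patch's context lines, so treat the layout as an assumption. refcount_t is modelled as a 32-bit integer, and fixed-width types stand in for pointers and unsigned long so both ABIs can be compared on one machine. All `model*` names are hypothetical, and the 64-bit result assumes a host where uint64_t is 8-byte aligned, as on x86_64.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit model: pointer and unsigned long are 8 bytes wide. */
struct model64_old {
	uint64_t next;		/* assumed leading hash-chain pointer */
	uint32_t hash;
	uint32_t size;
	uint32_t handle;
	uint64_t entries[];	/* frames start at the next 8-byte boundary */
};

struct model64_new {
	uint64_t next;
	uint32_t hash;
	uint32_t size;
	uint32_t handle;
	uint32_t count;		/* stands in for refcount_t */
	uint64_t entries[];
};

/* 32-bit model: pointer and unsigned long are 4 bytes wide. */
struct model32_old {
	uint32_t next;
	uint32_t hash;
	uint32_t size;
	uint32_t handle;
	uint32_t entries[];
};

struct model32_new {
	uint32_t next;
	uint32_t hash;
	uint32_t size;
	uint32_t handle;
	uint32_t count;		/* stands in for refcount_t */
	uint32_t entries[];
};

/* Bytes by which the fixed header (everything before entries[]) grows. */
size_t model64_header_growth(void)
{
	return offsetof(struct model64_new, entries) -
	       offsetof(struct model64_old, entries);
}

size_t model32_header_growth(void)
{
	return offsetof(struct model32_new, entries) -
	       offsetof(struct model32_old, entries);
}
```

Under these assumptions the 64-bit header stays at 24 bytes, because the counter lands in what used to be padding before entries[], while the 32-bit header grows from 16 to 20 bytes.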

The alternative would be to have some kind of struct like this:

	struct track_stacks {
		struct stack_record *stack;
		struct track_stacks *next;
		refcount_t stack_count;
	};

But that would require more allocations and gluing everything together,
which would make the code more complex, so I think that going with a
new field in struct stack_record is good enough.

Note that on __set_page_owner_handle(), page_owner->handle is set,
and on __reset_page_owner(), page_owner->free_handle is set.

We are interested in page_owner->handle, so when __set_page_owner()
gets called, we derive the stack_record struct from page_owner->handle
and increment its refcount_t field; and when __reset_page_owner()
gets called, we derive its stack_record from page_owner->handle
and decrement its refcount_t field.

Signed-off-by: Oscar Salvador
---
 include/linux/stackdepot.h |  2 ++
 lib/stackdepot.c           | 53 +++++++++++++++++++++++++++++++-------
 mm/page_owner.c            |  6 +++++
 3 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index e58306783d8e..6ba4fcdb0c5f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -94,6 +94,8 @@ static inline int stack_depot_early_init(void) { return 0; }
 depot_stack_handle_t __stack_depot_save(unsigned long *entries,
 					unsigned int nr_entries,
 					gfp_t gfp_flags, bool can_alloc);
+void stack_depot_inc_count(depot_stack_handle_t handle);
+void stack_depot_dec_count(depot_stack_handle_t handle);
 
 /**
  * stack_depot_save - Save a stack trace to stack depot
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 2f5aa851834e..bc4a9cd25834 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -60,6 +60,7 @@ struct stack_record {
 	u32 hash;			/* Hash in the hash table */
 	u32 size;			/* Number of stored frames */
 	union handle_parts handle;
+	refcount_t count;		/* Number of the same repeated stacks */
 	unsigned long entries[];	/* Variable-sized array of frames */
 };
 
@@ -305,6 +306,7 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
 	stack->handle.offset = pool_offset >> DEPOT_STACK_ALIGN;
 	stack->handle.valid = 1;
 	stack->handle.extra = 0;
+	refcount_set(&stack->count, 1);
 	memcpy(stack->entries, entries, flex_array_size(stack, entries, size));
 	pool_offset += required_size;
 	/*
@@ -457,8 +459,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_depot_save);
 
-unsigned int stack_depot_fetch(depot_stack_handle_t handle,
-			       unsigned long **entries)
+static struct stack_record *stack_depot_getstack(depot_stack_handle_t handle)
 {
 	union handle_parts parts = { .handle = handle };
 	/*
@@ -470,6 +471,26 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 	size_t offset = parts.offset << DEPOT_STACK_ALIGN;
 	struct stack_record *stack;
 
+	if (!handle)
+		return NULL;
+
+	if (parts.pool_index > pool_index_cached) {
+		WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
+			parts.pool_index, pool_index_cached, handle);
+		return NULL;
+	}
+	pool = stack_pools[parts.pool_index];
+	if (!pool)
+		return NULL;
+	stack = pool + offset;
+	return stack;
+}
+
+unsigned int stack_depot_fetch(depot_stack_handle_t handle,
+			       unsigned long **entries)
+{
+	struct stack_record *stack;
+
 	*entries = NULL;
 	/*
 	 * Let KMSAN know *entries is initialized. This shall prevent false
@@ -480,21 +501,33 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 	if (!handle)
 		return 0;
 
-	if (parts.pool_index > pool_index_cached) {
-		WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
-			parts.pool_index, pool_index_cached, handle);
-		return 0;
-	}
-	pool = stack_pools[parts.pool_index];
-	if (!pool)
+	stack = stack_depot_getstack(handle);
+	if (!stack)
 		return 0;
-	stack = pool + offset;
 
 	*entries = stack->entries;
 	return stack->size;
 }
 EXPORT_SYMBOL_GPL(stack_depot_fetch);
 
+void stack_depot_inc_count(depot_stack_handle_t handle)
+{
+	struct stack_record *stack = NULL;
+
+	stack = stack_depot_getstack(handle);
+	if (stack)
+		refcount_inc(&stack->count);
+}
+
+void stack_depot_dec_count(depot_stack_handle_t handle)
+{
+	struct stack_record *stack = NULL;
+
+	stack = stack_depot_getstack(handle);
+	if (stack)
+		refcount_dec(&stack->count);
+}
+
 void stack_depot_print(depot_stack_handle_t stack)
 {
 	unsigned long *entries;
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 31169b3e7f06..2d5d07013e4e 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -139,6 +139,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
 	int i;
 	struct page_ext *page_ext;
 	depot_stack_handle_t handle;
+	depot_stack_handle_t alloc_handle;
 	struct page_owner *page_owner;
 	u64 free_ts_nsec = local_clock();
 
@@ -146,6 +147,9 @@ void __reset_page_owner(struct page *page, unsigned short order)
 	if (unlikely(!page_ext))
 		return;
 
+	page_owner = get_page_owner(page_ext);
+	alloc_handle = page_owner->handle;
+
 	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
 	for (i = 0; i < (1 << order); i++) {
 		__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
@@ -155,6 +159,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
 		page_ext = page_ext_next(page_ext);
 	}
 	page_ext_put(page_ext);
+	stack_depot_dec_count(alloc_handle);
 }
 
 static inline void __set_page_owner_handle(struct page_ext *page_ext,
@@ -196,6 +201,7 @@ noinline void __set_page_owner(struct page *page, unsigned short order,
 		return;
 	__set_page_owner_handle(page_ext, handle, order, gfp_mask);
 	page_ext_put(page_ext);
+	stack_depot_inc_count(handle);
 }
 
 void __set_page_owner_migrate_reason(struct page *page, int reason)
-- 
2.35.3
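
The counting scheme in this patch can be sketched in userspace as follows. This is only an illustration under stated assumptions: C11 atomics stand in for refcount_t, a small static array stands in for the stack pools, and every `model_*` name is hypothetical. `model_above_threshold()` is one possible way a reader of the counter might filter stacks; this patch itself adds no such reader.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_NR_RECORDS 4

/* Hypothetical stand-in for struct stack_record: only the counter. */
struct model_record {
	atomic_uint count;
};

static struct model_record model_pool[MODEL_NR_RECORDS];

/* Like stack_depot_getstack(): handle 0 means "no stack". */
static struct model_record *model_getstack(unsigned int handle)
{
	if (!handle || handle > MODEL_NR_RECORDS)
		return NULL;
	return &model_pool[handle - 1];
}

/* Like depot_alloc_stack(): a freshly stored stack starts at 1. */
void model_init(unsigned int handle)
{
	struct model_record *rec = model_getstack(handle);

	if (rec)
		atomic_store(&rec->count, 1);
}

/* Like stack_depot_inc_count(), called when a page is allocated. */
void model_inc_count(unsigned int handle)
{
	struct model_record *rec = model_getstack(handle);

	if (rec)
		atomic_fetch_add(&rec->count, 1);
}

/* Like stack_depot_dec_count(), called when a page is freed. */
void model_dec_count(unsigned int handle)
{
	struct model_record *rec = model_getstack(handle);

	if (rec)
		atomic_fetch_sub(&rec->count, 1);
}

/* Returns the current count, or 0 for an invalid handle. */
unsigned int model_read_count(unsigned int handle)
{
	struct model_record *rec = model_getstack(handle);

	return rec ? atomic_load(&rec->count) : 0;
}

/* Hypothetical filter: nonzero if the count exceeds the threshold. */
int model_above_threshold(unsigned int handle, unsigned int threshold)
{
	return model_read_count(handle) > threshold;
}
```

Because the record starts at 1 and each allocation adds one more, the counter in this model reads one higher than the number of live allocations that share the stack. A consumer that wants the exact number of users would have to account for that initial reference.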