From: Steven Rostedt
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org, x86@kernel.org
Cc: Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf, Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim
Subject: [PATCH v8 08/18] unwind_user/deferred: Add unwind cache
Date: Fri, 09 May 2025 12:45:32 -0400
Message-ID: <20250509165154.958107663@goodmis.org>
References: <20250509164524.448387100@goodmis.org>

From: Josh Poimboeuf

Cache the results of the unwind to ensure the unwind is only performed
once, even when called by multiple tracers.

The cache nr_entries gets cleared every time the task exits the kernel.
When a stacktrace is requested, nr_entries gets set to the number of
entries in the stacktrace. If another stacktrace is requested and
nr_entries is not zero, the cache already contains the same stacktrace
that would be retrieved, so it is not processed again and the cached
entries are given to the caller.

Co-developed-by: Steven Rostedt (Google)
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
Changes since v7: https://lore.kernel.org/20250502165008.734340489@goodmis.org

- Allocate unwind_cache as a structure and not just its entries
  (Ingo Molnar)

- Fixed white space issues (Ingo Molnar)

 include/linux/entry-common.h          |  2 ++
 include/linux/unwind_deferred.h       |  8 ++++++++
 include/linux/unwind_deferred_types.h |  7 ++++++-
 kernel/unwind/deferred.c              | 26 ++++++++++++++++++++------
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index f94f3fdf15fc..6e850c9d3f0c 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -362,6 +363,7 @@ static __always_inline void exit_to_user_mode(void)
 	lockdep_hardirqs_on_prepare();
 	instrumentation_end();
 
+	unwind_exit_to_user_mode();
 	user_enter_irqoff();
 	arch_exit_to_user_mode();
 	lockdep_hardirqs_on(CALLER_ADDR0);
diff --git a/include/linux/unwind_deferred.h b/include/linux/unwind_deferred.h
index 5064ebe38c4f..7d6cb2ffd084 100644
--- a/include/linux/unwind_deferred.h
+++ b/include/linux/unwind_deferred.h
@@ -12,6 +12,12 @@ void unwind_task_free(struct task_struct *task);
 
 int unwind_deferred_trace(struct unwind_stacktrace *trace);
 
+static __always_inline void unwind_exit_to_user_mode(void)
+{
+	if (unlikely(current->unwind_info.cache))
+		current->unwind_info.cache->nr_entries = 0;
+}
+
 #else /* !CONFIG_UNWIND_USER */
 
 static inline void unwind_task_init(struct task_struct *task) {}
@@ -19,6 +25,8 @@ static inline void unwind_task_free(struct task_struct *task) {}
 
 static inline int unwind_deferred_trace(struct unwind_stacktrace *trace) { return -ENOSYS; }
 
+static inline void unwind_exit_to_user_mode(void) {}
+
 #endif /* !CONFIG_UNWIND_USER */
 
 #endif /* _LINUX_UNWIND_USER_DEFERRED_H */
diff --git a/include/linux/unwind_deferred_types.h b/include/linux/unwind_deferred_types.h
index aa32db574e43..db5b54b18828 100644
--- a/include/linux/unwind_deferred_types.h
+++ b/include/linux/unwind_deferred_types.h
@@ -2,8 +2,13 @@
 #ifndef _LINUX_UNWIND_USER_DEFERRED_TYPES_H
 #define _LINUX_UNWIND_USER_DEFERRED_TYPES_H
 
+struct unwind_cache {
+	unsigned int nr_entries;
+	unsigned long entries[];
+};
+
 struct unwind_task_info {
-	unsigned long *entries;
+	struct unwind_cache *cache;
 };
 
 #endif /* _LINUX_UNWIND_USER_DEFERRED_TYPES_H */
diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c
index 0bafb95e6336..e3913781c8c6 100644
--- a/kernel/unwind/deferred.c
+++ b/kernel/unwind/deferred.c
@@ -24,6 +24,7 @@
 int unwind_deferred_trace(struct unwind_stacktrace *trace)
 {
 	struct unwind_task_info *info = &current->unwind_info;
+	struct unwind_cache *cache;
 
 	/* Should always be called from faultable context */
 	might_fault();
@@ -31,17 +32,30 @@ int unwind_deferred_trace(struct unwind_stacktrace *trace)
 	if (current->flags & PF_EXITING)
 		return -EINVAL;
 
-	if (!info->entries) {
-		info->entries = kmalloc_array(UNWIND_MAX_ENTRIES, sizeof(long),
-					      GFP_KERNEL);
-		if (!info->entries)
+	if (!info->cache) {
+		info->cache = kzalloc(struct_size(cache, entries, UNWIND_MAX_ENTRIES),
+				      GFP_KERNEL);
+		if (!info->cache)
 			return -ENOMEM;
 	}
 
+	cache = info->cache;
+	trace->entries = cache->entries;
+
+	if (cache->nr_entries) {
+		/*
+		 * The user stack has already been previously unwound in this
+		 * entry context. Skip the unwind and use the cache.
+		 */
+		trace->nr = cache->nr_entries;
+		return 0;
+	}
+
 	trace->nr = 0;
-	trace->entries = info->entries;
 	unwind_user(trace, UNWIND_MAX_ENTRIES);
 
+	cache->nr_entries = trace->nr;
+
 	return 0;
 }
 
@@ -56,5 +70,5 @@ void unwind_task_free(struct task_struct *task)
 {
 	struct unwind_task_info *info = &task->unwind_info;
 
-	kfree(info->entries);
+	kfree(info->cache);
 }
-- 
2.47.2
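
For readers who want to try the caching behavior outside the kernel, here
is a rough user-space sketch of the control flow described in the commit
message. It is an illustration only, not part of the patch. The names
model_unwind_user(), model_deferred_trace(), model_exit_to_user_mode(),
task_cache, unwind_calls and MODEL_MAX_ENTRIES are all hypothetical, and
calloc() with a hand-computed size stands in for kzalloc() with
struct_size(). The fake unwinder just writes made-up addresses.

```c
/* User-space model of the per-task unwind cache; NOT kernel code. */
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_ENTRIES 4	/* hypothetical stand-in for UNWIND_MAX_ENTRIES */

struct unwind_cache {
	unsigned int nr_entries;	/* 0 means "nothing cached" */
	unsigned long entries[];	/* flexible array, as in the patch */
};

struct unwind_stacktrace {
	unsigned int nr;
	unsigned long *entries;
};

/* Assumption: one global models current->unwind_info.cache. */
static struct unwind_cache *task_cache;
/* Counts how often the (expensive) stack walk actually runs. */
static unsigned int unwind_calls;

/* Hypothetical stand-in for unwind_user(): fills in fake addresses. */
static void model_unwind_user(struct unwind_stacktrace *trace, unsigned int max)
{
	unwind_calls++;
	for (trace->nr = 0; trace->nr < max; trace->nr++)
		trace->entries[trace->nr] = 0x1000 + trace->nr;
}

/* Follows the same steps as unwind_deferred_trace() in the patch. */
static int model_deferred_trace(struct unwind_stacktrace *trace)
{
	struct unwind_cache *cache;

	assert(trace);

	if (!task_cache) {
		/* One zeroed allocation covers the header and entries[]. */
		task_cache = calloc(1, sizeof(*task_cache) +
				    MODEL_MAX_ENTRIES * sizeof(task_cache->entries[0]));
		if (!task_cache)
			return -1;
	}

	cache = task_cache;
	trace->entries = cache->entries;

	if (cache->nr_entries) {
		/* Already unwound since the last exit to user: reuse it. */
		trace->nr = cache->nr_entries;
		return 0;
	}

	trace->nr = 0;
	model_unwind_user(trace, MODEL_MAX_ENTRIES);
	cache->nr_entries = trace->nr;

	return 0;
}

/* Models unwind_exit_to_user_mode(): drop the contents, keep the buffer. */
static void model_exit_to_user_mode(void)
{
	if (task_cache)
		task_cache->nr_entries = 0;
}
```

In this model, two back-to-back calls to model_deferred_trace() run the
walk once and hand both callers the same entries pointer. After
model_exit_to_user_mode() the next call walks again. The buffer itself is
kept across exits and only the count is reset, which matches how the patch
frees the cache only in unwind_task_free().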