From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF737276038; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=eDxvkeF1a3LIDeUdPwJLqAXCu1eUqfg48Z291XZvyiUZM5fPoiDQJv02NwIGt2ovqTPNobA4JNtajfnaPEgaECEiU1R+QY+VSyJlhzLRNs1lPoFsI48rBpZCdVDJSWvmQNc+g/FH1+EeCGAxv1bnBlJzeKEv8YOCrNgHayfVD+4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; bh=A0XeP2rXDqyWwFrjwqotxM0fEsM6+5Akovg8OTwGKFU=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=GyeJ1a/rwRE1CrQdYNMGKee/OHYyq0ZOS8LUurLx+wwZrP77ApkwEFLxmwVQPvrYUvKZifeKY4D4Cq8lgLn2TqnppsmcRp0/g0VeMRg5Q6VAJPMYFJ9sV6bIl03ydF9UbB0bsUa0INaNWc9S/Rdq/+uF4rKUMEy3CVFR4NBUz1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 80C08C4CEEC; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcB-00000001dNY-3NgN; Wed, 30 Apr 2025 16:01:07 -0400 Message-ID: <20250430200107.657571145@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:47 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , 
Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 01/18] unwind_user: Add user space unwinding API References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Introduce a generic API for unwinding user stacks. To let user space unwinding handle more complex scenarios, such as deferred unwinding and reading user space information, create a generic interface that all architectures supporting the various unwinding methods can use. This is an alternative to the simple stack_trace_save_user() API for handling user space stack traces. It does not replace that interface, but it will be used to expand the functionality of user space stack walking. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- Changes from v6: https://lore.kernel.org/20250425145811.822676841@goodmis.org - Use (current->flags & PF_KTHREAD) instead of !(current->mm) for testing if a task is a kernel thread or not. 
(Josh Poimboeuf) MAINTAINERS | 8 +++++ arch/Kconfig | 3 ++ include/linux/unwind_user.h | 15 +++++++++ include/linux/unwind_user_types.h | 31 +++++++++++++++++ kernel/Makefile | 1 + kernel/unwind/Makefile | 1 + kernel/unwind/user.c | 55 +++++++++++++++++++++++++++++++ 7 files changed, 114 insertions(+) create mode 100644 include/linux/unwind_user.h create mode 100644 include/linux/unwind_user_types.h create mode 100644 kernel/unwind/Makefile create mode 100644 kernel/unwind/user.c diff --git a/MAINTAINERS b/MAINTAINERS index fedcbcba8397..f94b8d05543d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -25308,6 +25308,14 @@ F: Documentation/driver-api/uio-howto.rst F: drivers/uio/ F: include/linux/uio_driver.h =20 +USERSPACE STACK UNWINDING +M: Josh Poimboeuf +M: Steven Rostedt +S: Maintained +F: include/linux/unwind*.h +F: kernel/unwind/ + + UTIL-LINUX PACKAGE M: Karel Zak L: util-linux@vger.kernel.org diff --git a/arch/Kconfig b/arch/Kconfig index b0adb665041f..ccbcead9fac0 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -435,6 +435,9 @@ config HAVE_HARDLOCKUP_DETECTOR_ARCH It uses the same command line parameters, and sysctl interface, as the generic hardlockup detectors. 
=20 +config UNWIND_USER + bool + config HAVE_PERF_REGS bool help diff --git a/include/linux/unwind_user.h b/include/linux/unwind_user.h new file mode 100644 index 000000000000..aa7923c1384f --- /dev/null +++ b/include/linux/unwind_user.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_UNWIND_USER_H +#define _LINUX_UNWIND_USER_H + +#include + +int unwind_user_start(struct unwind_user_state *state); +int unwind_user_next(struct unwind_user_state *state); + +int unwind_user(struct unwind_stacktrace *trace, unsigned int max_entries); + +#define for_each_user_frame(state) \ + for (unwind_user_start((state)); !(state)->done; unwind_user_next((state)= )) + +#endif /* _LINUX_UNWIND_USER_H */ diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_= types.h new file mode 100644 index 000000000000..6ed1b4ae74e1 --- /dev/null +++ b/include/linux/unwind_user_types.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_UNWIND_USER_TYPES_H +#define _LINUX_UNWIND_USER_TYPES_H + +#include + +enum unwind_user_type { + UNWIND_USER_TYPE_NONE, +}; + +struct unwind_stacktrace { + unsigned int nr; + unsigned long *entries; +}; + +struct unwind_user_frame { + s32 cfa_off; + s32 ra_off; + s32 fp_off; + bool use_fp; +}; + +struct unwind_user_state { + unsigned long ip; + unsigned long sp; + unsigned long fp; + enum unwind_user_type type; + bool done; +}; + +#endif /* _LINUX_UNWIND_USER_TYPES_H */ diff --git a/kernel/Makefile b/kernel/Makefile index 434929de17ef..5a2b2be2a32d 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -55,6 +55,7 @@ obj-y +=3D rcu/ obj-y +=3D livepatch/ obj-y +=3D dma/ obj-y +=3D entry/ +obj-y +=3D unwind/ obj-$(CONFIG_MODULES) +=3D module/ =20 obj-$(CONFIG_KCMP) +=3D kcmp.o diff --git a/kernel/unwind/Makefile b/kernel/unwind/Makefile new file mode 100644 index 000000000000..349ce3677526 --- /dev/null +++ b/kernel/unwind/Makefile @@ -0,0 +1 @@ + obj-$(CONFIG_UNWIND_USER) +=3D user.o diff --git 
a/kernel/unwind/user.c b/kernel/unwind/user.c new file mode 100644 index 000000000000..d30449328981 --- /dev/null +++ b/kernel/unwind/user.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +/* +* Generic interfaces for unwinding user space +*/ +#include +#include +#include +#include + +int unwind_user_next(struct unwind_user_state *state) +{ + /* no implementation yet */ + return -EINVAL; +} + +int unwind_user_start(struct unwind_user_state *state) +{ + struct pt_regs *regs =3D task_pt_regs(current); + + memset(state, 0, sizeof(*state)); + + if ((current->flags & PF_KTHREAD) || !user_mode(regs)) { + state->done =3D true; + return -EINVAL; + } + + state->type =3D UNWIND_USER_TYPE_NONE; + + state->ip =3D instruction_pointer(regs); + state->sp =3D user_stack_pointer(regs); + state->fp =3D frame_pointer(regs); + + return 0; +} + +int unwind_user(struct unwind_stacktrace *trace, unsigned int max_entries) +{ + struct unwind_user_state state; + + trace->nr =3D 0; + + if (!max_entries) + return -EINVAL; + + if (current->flags & PF_KTHREAD) + return 0; + + for_each_user_frame(&state) { + trace->entries[trace->nr++] =3D state.ip; + if (trace->nr >=3D max_entries) + break; + } + + return 0; +} --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3285F28033F; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=DshsaL+AFqLdgQ0TIFkpgY244l88STyDcbaXEMkUUr2JLreJuhKZ/Wz2C+ylLzBdKszKiRSlZD1KewifMjwCzjhOS9DWtm0kcpatGSDbwEatQhoM4bW6sk7CjdZJtMmhldCuSgPmyzVCoFhwrj+UJhXVTJiNOxzQS0zeObH4zSU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1746043264; c=relaxed/simple; bh=YQCIvf5LGT5XFWsFFOofkXUiq80zAe/dxuiLdDbLBHk=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=XbijrJXce8YiXuv6zAqVn+N1f5xkM3pqcx8cj6tSxuYfhOaPbZDdoJpxW25QjaW8qH3DnYbOrohbeSQVXuw7cytiXy482azipopAuqcv243HyeyK3NbT/mFqqZ50rD0lem0xItY5FAuYs6gYc3qPdM83v51EJccgO2u+V5JMGYE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id AC5E3C4CEF1; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcB-00000001dO2-44s8; Wed, 30 Apr 2025 16:01:07 -0400 Message-ID: <20250430200107.823480244@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:48 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 02/18] unwind_user: Add frame pointer support References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Add optional support for user space frame pointer unwinding. If supported, the arch needs to enable CONFIG_HAVE_UNWIND_USER_FP and define ARCH_INIT_USER_FP_FRAME. 
By encoding the frame offsets in struct unwind_user_frame, much of this code can also be reused for future unwinder implementations like sframe. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- arch/Kconfig | 4 +++ include/asm-generic/unwind_user.h | 9 ++++++ include/linux/unwind_user_types.h | 1 + kernel/unwind/user.c | 51 +++++++++++++++++++++++++++++-- 4 files changed, 63 insertions(+), 2 deletions(-) create mode 100644 include/asm-generic/unwind_user.h diff --git a/arch/Kconfig b/arch/Kconfig index ccbcead9fac0..0e3844c0e200 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -438,6 +438,10 @@ config HAVE_HARDLOCKUP_DETECTOR_ARCH config UNWIND_USER bool =20 +config HAVE_UNWIND_USER_FP + bool + select UNWIND_USER + config HAVE_PERF_REGS bool help diff --git a/include/asm-generic/unwind_user.h b/include/asm-generic/unwind= _user.h new file mode 100644 index 000000000000..832425502fb3 --- /dev/null +++ b/include/asm-generic/unwind_user.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_GENERIC_UNWIND_USER_H +#define _ASM_GENERIC_UNWIND_USER_H + +#ifndef ARCH_INIT_USER_FP_FRAME + #define ARCH_INIT_USER_FP_FRAME +#endif + +#endif /* _ASM_GENERIC_UNWIND_USER_H */ diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_= types.h index 6ed1b4ae74e1..65bd070eb6b0 100644 --- a/include/linux/unwind_user_types.h +++ b/include/linux/unwind_user_types.h @@ -6,6 +6,7 @@ =20 enum unwind_user_type { UNWIND_USER_TYPE_NONE, + UNWIND_USER_TYPE_FP, }; =20 struct unwind_stacktrace { diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c index d30449328981..0671a81494d3 100644 --- a/kernel/unwind/user.c +++ b/kernel/unwind/user.c @@ -6,10 +6,54 @@ #include #include #include +#include +#include + +static struct unwind_user_frame fp_frame =3D { + ARCH_INIT_USER_FP_FRAME +}; + +static inline bool fp_state(struct unwind_user_state *state) +{ + return IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP) && + state->type =3D=3D 
UNWIND_USER_TYPE_FP; +} =20 int unwind_user_next(struct unwind_user_state *state) { - /* no implementation yet */ + struct unwind_user_frame _frame; + struct unwind_user_frame *frame =3D &_frame; + unsigned long cfa =3D 0, fp, ra =3D 0; + + if (state->done) + return -EINVAL; + + if (fp_state(state)) + frame =3D &fp_frame; + else + goto the_end; + + cfa =3D (frame->use_fp ? state->fp : state->sp) + frame->cfa_off; + + /* stack going in wrong direction? */ + if (cfa <=3D state->sp) + goto the_end; + + if (get_user(ra, (unsigned long *)(cfa + frame->ra_off))) + goto the_end; + + if (frame->fp_off && get_user(fp, (unsigned long __user *)(cfa + frame->f= p_off))) + goto the_end; + + state->ip =3D ra; + state->sp =3D cfa; + if (frame->fp_off) + state->fp =3D fp; + + return 0; + +the_end: + state->done =3D true; return -EINVAL; } =20 @@ -24,7 +68,10 @@ int unwind_user_start(struct unwind_user_state *state) return -EINVAL; } =20 - state->type =3D UNWIND_USER_TYPE_NONE; + if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) + state->type =3D UNWIND_USER_TYPE_FP; + else + state->type =3D UNWIND_USER_TYPE_NONE; =20 state->ip =3D instruction_pointer(regs); state->sp =3D user_stack_pointer(regs); --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 455FD1E1E10; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=YnnJFjLD72mVQEYgYsGLZ2jBA8mFiJVVe6k/hRIpYHWuU+dgJp0/q0h84lUxQjlGH7U6m3+IecdFTLcDRr02GDoBKQsgGlgTXAYI9AI7M+4h/T2mo+dGhsh4QvKLbL8O8BwbZhd0EG4VZ+tWHA4+j4cnc+ycnKfBAv/sbThKzro= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; 
c=relaxed/simple; bh=Ivqqrdd6BFIWSz+UZ5uzPlJbABlGLJtKzmx89lxeAbI=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=uThhrR+gF7rTekN2vbPwY7Qw+9V1nYySdHTPT6k0aIGkUzSq4jYBNVQCFfUCl0qsEwqAy7f24prW1+NHPhG1QOkUkpIecpjSYL66IkAYfiDt38UGB4JH3VZ2UoIDXkNF5JS+uGPAiNiNdmfPn5gB/+7KF9o6Vo2hHsaRtaGnbgE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id C604CC4CEF6; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dOW-0alf; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200107.992453476@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:49 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 03/18] unwind_user/x86: Enable frame pointer unwinding on x86 References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Use ARCH_INIT_USER_FP_FRAME to describe how frame pointers are unwound on x86, and enable CONFIG_HAVE_UNWIND_USER_FP accordingly so the unwind_user interfaces can be used. 
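As a rough illustration of what these offsets mean (a user space sketch, not part of the patch): with sizeof(long) == 8, ARCH_INIT_USER_FP_FRAME works out to CFA = fp + 16, the return address at CFA - 8 and the saved frame pointer at CFA - 16. The code below replays the checks of unwind_user_next() over a made-up stack. The names frame_desc, fake_stack, read_slot(), walk_next() and demo_walk(), as well as all addresses, are invented for the example, and read_slot() only stands in for get_user().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as struct unwind_user_frame in this series. */
struct frame_desc {
	int32_t cfa_off;
	int32_t ra_off;
	int32_t fp_off;
	bool use_fp;
};

/* ARCH_INIT_USER_FP_FRAME evaluated for x86-64, where sizeof(long) == 8. */
static const struct frame_desc x86_64_fp_frame = {
	.cfa_off = 16,
	.ra_off = -8,
	.fp_off = -16,
	.use_fp = true,
};

/* Hypothetical stand-in for user memory: 8-byte slots from STACK_BASE up. */
#define STACK_BASE	0x1000ULL
#define STACK_SLOTS	16
static uint64_t fake_stack[STACK_SLOTS];

/* Stand-in for get_user(): nonzero when addr is outside the fake stack. */
static int read_slot(uint64_t *val, uint64_t addr)
{
	if (addr < STACK_BASE || addr >= STACK_BASE + 8 * STACK_SLOTS || (addr & 7))
		return -1;
	*val = fake_stack[(addr - STACK_BASE) / 8];
	return 0;
}

struct walk_state {
	uint64_t ip, sp, fp;
	bool done;
};

/* One unwind step, in the same order of checks as unwind_user_next(). */
static int walk_next(struct walk_state *s, const struct frame_desc *f)
{
	uint64_t cfa, ra, fp = 0;

	if (s->done)
		return -1;

	cfa = (f->use_fp ? s->fp : s->sp) + f->cfa_off;
	if (cfa <= s->sp)		/* stack going in the wrong direction */
		goto the_end;
	if (read_slot(&ra, cfa + f->ra_off))
		goto the_end;
	if (f->fp_off && read_slot(&fp, cfa + f->fp_off))
		goto the_end;

	s->ip = ra;
	s->sp = cfa;
	if (f->fp_off)
		s->fp = fp;
	return 0;

the_end:
	s->done = true;
	return -1;
}

/* Lay out three made-up frames, then collect up to max instruction pointers. */
static size_t demo_walk(uint64_t *entries, size_t max)
{
	struct walk_state s = { .ip = 0x400000, .sp = 0x1008, .fp = 0x1010 };
	size_t nr = 0;

	fake_stack[2] = 0x1030;	fake_stack[3] = 0x400100;	/* 0x1010, 0x1018 */
	fake_stack[6] = 0x1050;	fake_stack[7] = 0x400200;	/* 0x1030, 0x1038 */
	fake_stack[10] = 0;	fake_stack[11] = 0x400300;	/* 0x1050, 0x1058 */

	if (!max)
		return 0;
	for (; !s.done; walk_next(&s, &x86_64_fp_frame)) {
		entries[nr++] = s.ip;
		if (nr >= max)
			break;
	}
	return nr;
}
```

With this layout, demo_walk() records the starting ip followed by the three return addresses. It stops once the saved frame pointer of 0 produces a CFA that is not above the current sp, which is the same "wrong direction" check the kernel code relies on to end the walk.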
Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- arch/x86/Kconfig | 1 + arch/x86/include/asm/unwind_user.h | 11 +++++++++++ 2 files changed, 12 insertions(+) create mode 100644 arch/x86/include/asm/unwind_user.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index aeac63b11fc2..b5a85d2be5ee 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -301,6 +301,7 @@ config X86 select HAVE_SYSCALL_TRACEPOINTS select HAVE_UACCESS_VALIDATION if HAVE_OBJTOOL select HAVE_UNSTABLE_SCHED_CLOCK + select HAVE_UNWIND_USER_FP if X86_64 select HAVE_USER_RETURN_NOTIFIER select HAVE_GENERIC_VDSO select VDSO_GETRANDOM if X86_64 diff --git a/arch/x86/include/asm/unwind_user.h b/arch/x86/include/asm/unwi= nd_user.h new file mode 100644 index 000000000000..8597857bf896 --- /dev/null +++ b/arch/x86/include/asm/unwind_user.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_UNWIND_USER_H +#define _ASM_X86_UNWIND_USER_H + +#define ARCH_INIT_USER_FP_FRAME \ + .cfa_off =3D (s32)sizeof(long) * 2, \ + .ra_off =3D (s32)sizeof(long) * -1, \ + .fp_off =3D (s32)sizeof(long) * -2, \ + .use_fp =3D true, + +#endif /* _ASM_X86_UNWIND_USER_H */ --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F07C283FF0; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=ZSIbn5xQ+hRg1Hq/aOHOLFuDQHT1kQvXDqtVRU/Y8pFLrIGGwdH7M3u71XIM+YSqxbdYbB3md0puVs34/UfsZyLDwwFz2Ez9ADv1XkUDRpHdExJ4UNpgG9NsNal+PZSlO6LBMzambTqwyXsC2zKKT/5CP8sqvqoDErvjytY/EWk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; 
bh=7I11mP3w18vX4htds3yiDDbcRgMIVKdNKXayRuKw4cg=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=XL+UMgZrt0Cr8AOvLx/gg6yJOSXWd5qzzZsAtgwSU2MnazWI1hiyTeXdES+BPDhbbt+944BOevAsi6kKejO27X3D1t4CgZ2mZsETdkoR361ShZk5K4eArdpJRc08jpItqQm3mZVK4TwzUk0cNsmADv/4Qs3zof/iEbUYMGEDn2U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id EFF1BC4CEEA; Wed, 30 Apr 2025 20:01:03 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dP3-1IEj; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200108.162415077@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:50 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 04/18] perf/x86: Rename and move get_segment_base() and make it global References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf get_segment_base() will be used by the unwind_user code, so make it global and rename it so it doesn't conflict with a KVM function of the same name. 
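For readers less familiar with what the function takes as input, here is a small user space sketch (not part of the patch) of how an x86 segment selector splits into the pieces the function looks at: the table index (segment >> 3) and the table indicator bit that picks the LDT or the GDT. The SEL_* names and decode_selector() are invented for the example; the mask values are meant to match the kernel's SEGMENT_RPL_MASK, SEGMENT_TI_MASK and SEGMENT_LDT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies of SEGMENT_RPL_MASK, SEGMENT_TI_MASK and SEGMENT_LDT. */
#define SEL_RPL_MASK	0x3u
#define SEL_TI_MASK	0x4u
#define SEL_LDT		0x4u

struct selector_parts {
	unsigned int index;	/* descriptor table slot: selector >> 3 */
	bool in_ldt;		/* true: LDT entry, false: GDT entry */
	unsigned int rpl;	/* requested privilege level */
};

/* Split a 16-bit segment selector into index, table indicator and RPL. */
static struct selector_parts decode_selector(uint16_t sel)
{
	struct selector_parts p = {
		.index = sel >> 3,
		.in_ldt = (sel & SEL_TI_MASK) == SEL_LDT,
		.rpl = sel & SEL_RPL_MASK,
	};

	return p;
}
```

For example, 0x33 (believed to be the usual __USER_CS value on x86-64) decodes to GDT index 6 with RPL 3, while a selector such as 0x0f has the TI bit set and so refers to LDT entry 1.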
As the function is no longer specific to perf, move it to ptrace.c as that seems to be a better location for a generic function like this. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- arch/x86/events/core.c | 44 ++++------------------------------- arch/x86/include/asm/ptrace.h | 2 ++ arch/x86/kernel/ptrace.c | 38 ++++++++++++++++++++++++++++++ 3 files changed, 45 insertions(+), 39 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 85eb0eb1b284..524a59d9c2c4 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -42,6 +42,7 @@ #include #include #include +#include #include =20 #include "perf_event.h" @@ -2807,41 +2808,6 @@ valid_user_frame(const void __user *fp, unsigned long size) return __access_ok(fp, size); } =20 -static unsigned long get_segment_base(unsigned int segment) -{ - struct desc_struct *desc; - unsigned int idx =3D segment >> 3; - - if ((segment & SEGMENT_TI_MASK) =3D=3D SEGMENT_LDT) { -#ifdef CONFIG_MODIFY_LDT_SYSCALL - struct ldt_struct *ldt; - - /* - * If we're not in a valid context with a real (not just lazy) - * user mm, then don't even try. - */ - if (!nmi_uaccess_okay()) - return 0; - - /* IRQs are off, so this synchronizes with smp_store_release */ - ldt =3D smp_load_acquire(&current->mm->context.ldt); - if (!ldt || idx >=3D ldt->nr_entries) - return 0; - - desc =3D &ldt->entries[idx]; -#else - return 0; -#endif - } else { - if (idx >=3D GDT_ENTRIES) - return 0; - - desc =3D raw_cpu_ptr(gdt_page.gdt) + idx; - } - - return get_desc_base(desc); -} - #ifdef CONFIG_UPROBES /* * Heuristic-based check if uprobe is installed at the function entry. 
@@ -2898,8 +2864,8 @@ perf_callchain_user32(struct pt_regs *regs, struct pe= rf_callchain_entry_ctx *ent if (user_64bit_mode(regs)) return 0; =20 - cs_base =3D get_segment_base(regs->cs); - ss_base =3D get_segment_base(regs->ss); + cs_base =3D segment_base_address(regs->cs); + ss_base =3D segment_base_address(regs->ss); =20 fp =3D compat_ptr(ss_base + regs->bp); pagefault_disable(); @@ -3018,11 +2984,11 @@ static unsigned long code_segment_base(struct pt_re= gs *regs) return 0x10 * regs->cs; =20 if (user_mode(regs) && regs->cs !=3D __USER_CS) - return get_segment_base(regs->cs); + return segment_base_address(regs->cs); #else if (user_mode(regs) && !user_64bit_mode(regs) && regs->cs !=3D __USER32_CS) - return get_segment_base(regs->cs); + return segment_base_address(regs->cs); #endif return 0; } diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 50f75467f73d..59357ec98e52 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -314,6 +314,8 @@ static __always_inline bool regs_irqs_disabled(struct p= t_regs *regs) return !(regs->flags & X86_EFLAGS_IF); } =20 +unsigned long segment_base_address(unsigned int segment); + /* Query offset/name of register from its name/offset */ extern int regs_query_register_offset(const char *name); extern const char *regs_query_register_name(unsigned int offset); diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 095f04bdabdc..81353a09701b 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -41,6 +41,7 @@ #include #include #include +#include =20 #include "tls.h" =20 @@ -339,6 +340,43 @@ static int set_segment_reg(struct task_struct *task, =20 #endif /* CONFIG_X86_32 */ =20 +unsigned long segment_base_address(unsigned int segment) +{ + struct desc_struct *desc; + unsigned int idx =3D segment >> 3; + + lockdep_assert_irqs_disabled(); + + if ((segment & SEGMENT_TI_MASK) =3D=3D SEGMENT_LDT) { +#ifdef CONFIG_MODIFY_LDT_SYSCALL + struct ldt_struct 
*ldt; + + /* + * If we're not in a valid context with a real (not just lazy) + * user mm, then don't even try. + */ + if (!nmi_uaccess_okay()) + return 0; + + /* IRQs are off, so this synchronizes with smp_store_release */ + ldt =3D smp_load_acquire(&current->mm->context.ldt); + if (!ldt || idx >=3D ldt->nr_entries) + return 0; + + desc =3D &ldt->entries[idx]; +#else + return 0; +#endif + } else { + if (idx >=3D GDT_ENTRIES) + return 0; + + desc =3D raw_cpu_ptr(gdt_page.gdt) + idx; + } + + return get_desc_base(desc); +} + static unsigned long get_flags(struct task_struct *task) { unsigned long retval =3D task_pt_regs(task)->flags; --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4A442882BD; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=Fb+/IJYO+9El001FJdgW1n1sPPFlcih4jyiFcDfJqHXmvPJ5OrdfpKPpFevnCeWsJRoT5jtGwIhWefw02N4WuZWZvPtrkMVo6XqqV8brn0BWs3FCz9/JJ2t+cIOYHZIxsmV8t9sncFqVGWiFjuj2tmcoLy2tBgAr/4mRylrKoqg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; bh=/O8CQz956FRS/3j6BWFD/9/GKy/nCXhh68LzH+c4gKM=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=K/fMec1Qe3R2upC5+sc13dGeKgOT9agltlP+GZYhWPSkAjG2rgAl2ccEKQKdwRZeHueemuyW8GAgostS+8KPDwFCfekftkkCBcqx/A0V4pOMvQ8LIQphLlZZ03qR1A4zxUV+aNLY86CvRhyrFD9OAA5aqA7rCVOphN3GWCVvZxo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 48135C4CEED; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with 
local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dPX-20KY; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200108.328096472@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:51 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 05/18] unwind_user: Add compat mode frame pointer support References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Add optional support for user space compat mode frame pointer unwinding. If supported, the arch needs to enable CONFIG_HAVE_UNWIND_USER_COMPAT_FP and define ARCH_INIT_USER_COMPAT_FP_FRAME. 
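The main practical difference in compat mode is the width of the values on the stack: a 32-bit task stores 4-byte return addresses and frame pointers, so each read has to fetch a u32 and zero-extend it instead of fetching a u64. A minimal user space sketch of that idea follows (not part of the patch). read_stack_word() and fill_compat_stack() are names invented for the example, the byte buffer stands in for user memory, and the addresses are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a width-aware get_user(): read one stack word
 * from a byte buffer that plays the role of user memory. Returns 0 on
 * success and -1 if the read would run past the end of the buffer.
 */
static int read_stack_word(uint64_t *to, const uint8_t *mem, size_t len,
			   size_t off, bool compat)
{
	size_t width = compat ? sizeof(uint32_t) : sizeof(uint64_t);

	if (off > len || len - off < width)
		return -1;

	if (compat) {
		uint32_t v;

		memcpy(&v, mem + off, sizeof(v));
		*to = v;	/* zero-extended to 64 bits */
	} else {
		uint64_t v;

		memcpy(&v, mem + off, sizeof(v));
		*to = v;
	}
	return 0;
}

/* Two adjacent 32-bit words: a saved frame pointer, then a return address. */
static void fill_compat_stack(uint8_t mem[8])
{
	const uint32_t saved_fp = 0xffffd000u;
	const uint32_t ret_addr = 0x08048123u;

	memcpy(mem, &saved_fp, sizeof(saved_fp));
	memcpy(mem + 4, &ret_addr, sizeof(ret_addr));
}
```

Reading offset 0 with compat set returns just the saved frame pointer, while a 64-bit read at the same offset would fold the neighbouring return address into the result. That mix-up is what selecting the access width per task type avoids.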
Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- arch/Kconfig | 4 +++ include/asm-generic/Kbuild | 2 ++ include/asm-generic/unwind_user.h | 15 +++++++++++ include/asm-generic/unwind_user_types.h | 9 +++++++ include/linux/unwind_user_types.h | 3 +++ kernel/unwind/user.c | 36 ++++++++++++++++++++++--- 6 files changed, 65 insertions(+), 4 deletions(-) create mode 100644 include/asm-generic/unwind_user_types.h diff --git a/arch/Kconfig b/arch/Kconfig index 0e3844c0e200..dbb1cc89e040 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -442,6 +442,10 @@ config HAVE_UNWIND_USER_FP bool select UNWIND_USER =20 +config HAVE_UNWIND_USER_COMPAT_FP + bool + depends on HAVE_UNWIND_USER_FP + config HAVE_PERF_REGS bool help diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild index 8675b7b4ad23..b797a2434396 100644 --- a/include/asm-generic/Kbuild +++ b/include/asm-generic/Kbuild @@ -59,6 +59,8 @@ mandatory-y +=3D tlbflush.h mandatory-y +=3D topology.h mandatory-y +=3D trace_clock.h mandatory-y +=3D uaccess.h +mandatory-y +=3D unwind_user.h +mandatory-y +=3D unwind_user_types.h mandatory-y +=3D vermagic.h mandatory-y +=3D vga.h mandatory-y +=3D video.h diff --git a/include/asm-generic/unwind_user.h b/include/asm-generic/unwind= _user.h index 832425502fb3..385638ce4aec 100644 --- a/include/asm-generic/unwind_user.h +++ b/include/asm-generic/unwind_user.h @@ -2,8 +2,23 @@ #ifndef _ASM_GENERIC_UNWIND_USER_H #define _ASM_GENERIC_UNWIND_USER_H =20 +#include + #ifndef ARCH_INIT_USER_FP_FRAME #define ARCH_INIT_USER_FP_FRAME #endif =20 +#ifndef ARCH_INIT_USER_COMPAT_FP_FRAME + #define ARCH_INIT_USER_COMPAT_FP_FRAME + #define in_compat_mode(regs) false +#endif + +#ifndef arch_unwind_user_init +static inline void arch_unwind_user_init(struct unwind_user_state *state, = struct pt_regs *reg) {} +#endif + +#ifndef arch_unwind_user_next +static inline void arch_unwind_user_next(struct unwind_user_state *state) = {} +#endif + #endif /* 
_ASM_GENERIC_UNWIND_USER_H */ diff --git a/include/asm-generic/unwind_user_types.h b/include/asm-generic/= unwind_user_types.h new file mode 100644 index 000000000000..ee803de7c998 --- /dev/null +++ b/include/asm-generic/unwind_user_types.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_GENERIC_UNWIND_USER_TYPES_H +#define _ASM_GENERIC_UNWIND_USER_TYPES_H + +#ifndef arch_unwind_user_state +struct arch_unwind_user_state {}; +#endif + +#endif /* _ASM_GENERIC_UNWIND_USER_TYPES_H */ diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_= types.h index 65bd070eb6b0..3ec4a097a3dd 100644 --- a/include/linux/unwind_user_types.h +++ b/include/linux/unwind_user_types.h @@ -3,10 +3,12 @@ #define _LINUX_UNWIND_USER_TYPES_H =20 #include +#include =20 enum unwind_user_type { UNWIND_USER_TYPE_NONE, UNWIND_USER_TYPE_FP, + UNWIND_USER_TYPE_COMPAT_FP, }; =20 struct unwind_stacktrace { @@ -25,6 +27,7 @@ struct unwind_user_state { unsigned long ip; unsigned long sp; unsigned long fp; + struct arch_unwind_user_state arch; enum unwind_user_type type; bool done; }; diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c index 0671a81494d3..635cc04bb299 100644 --- a/kernel/unwind/user.c +++ b/kernel/unwind/user.c @@ -13,12 +13,32 @@ static struct unwind_user_frame fp_frame =3D { ARCH_INIT_USER_FP_FRAME }; =20 +static struct unwind_user_frame compat_fp_frame =3D { + ARCH_INIT_USER_COMPAT_FP_FRAME +}; + static inline bool fp_state(struct unwind_user_state *state) { return IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP) && state->type =3D=3D UNWIND_USER_TYPE_FP; } =20 +static inline bool compat_state(struct unwind_user_state *state) +{ + return IS_ENABLED(CONFIG_HAVE_UNWIND_USER_COMPAT_FP) && + state->type =3D=3D UNWIND_USER_TYPE_COMPAT_FP; +} + +#define UNWIND_GET_USER_LONG(to, from, state) \ +({ \ + int __ret; \ + if (compat_state(state)) \ + __ret =3D get_user(to, (u32 __user *)(from)); \ + else \ + __ret =3D get_user(to, (u64 __user *)(from)); \ + 
__ret; \ +}) + int unwind_user_next(struct unwind_user_state *state) { struct unwind_user_frame _frame; @@ -28,7 +48,9 @@ int unwind_user_next(struct unwind_user_state *state) if (state->done) return -EINVAL; =20 - if (fp_state(state)) + if (compat_state(state)) + frame =3D &compat_fp_frame; + else if (fp_state(state)) frame =3D &fp_frame; else goto the_end; @@ -39,10 +61,10 @@ int unwind_user_next(struct unwind_user_state *state) if (cfa <=3D state->sp) goto the_end; =20 - if (get_user(ra, (unsigned long *)(cfa + frame->ra_off))) + if (UNWIND_GET_USER_LONG(ra, cfa + frame->ra_off, state)) goto the_end; =20 - if (frame->fp_off && get_user(fp, (unsigned long __user *)(cfa + frame->f= p_off))) + if (frame->fp_off && UNWIND_GET_USER_LONG(fp, cfa + frame->fp_off, state)) goto the_end; =20 state->ip =3D ra; @@ -50,6 +72,8 @@ int unwind_user_next(struct unwind_user_state *state) if (frame->fp_off) state->fp =3D fp; =20 + arch_unwind_user_next(state); + return 0; =20 the_end: @@ -68,7 +92,9 @@ int unwind_user_start(struct unwind_user_state *state) return -EINVAL; } =20 - if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) + if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_COMPAT_FP) && in_compat_mode(regs)) + state->type =3D UNWIND_USER_TYPE_COMPAT_FP; + else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) state->type =3D UNWIND_USER_TYPE_FP; else state->type =3D UNWIND_USER_TYPE_NONE; @@ -77,6 +103,8 @@ int unwind_user_start(struct unwind_user_state *state) state->sp =3D user_stack_pointer(regs); state->fp =3D frame_pointer(regs); =20 + arch_unwind_user_init(state, regs); + return 0; } =20 --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4B162882C2; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=cX9Sja1Lr+aL2kyRJQWg9FnFYT3g7gp2+teMM7B3AEpDm4M2S87aynV/pl+s2giFG8A4jAx+bKXxGG1+06EuVNGrgfNzhjRw4ZCXZffWTg6tikAhLxUhIBqCMz/Bc+hma7iHp0TxZIgkcdZ0tF13qT/kWgJhOhy+NaWSjLLKZ48= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; bh=eATf4fZeNp3RnCSTULXp/Dl7q+fVeO6RbWAUiLtTQQY=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=PYAi8/wY6kPHvCaFHRenmiZBLlZ7bIo2xH36yz48GmRFx83vi9UMJ08BQakX1n4xXb6YxhXKzH1muk3maYjh7Q6A7aZW8N52RnJcWUekez2yJ2dnajizRGMlnrzVTMpazPunPPPs7+AAlOwi4jmUjmUrH5Czx+8xhMBr8alZTLQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5503AC4CEE7; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dQ1-2jDn; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200108.499313094@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:52 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 06/18] unwind_user/x86: Enable compat mode frame pointer unwinding on x86 References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Use ARCH_INIT_USER_COMPAT_FP_FRAME to describe how frame pointers are unwound on x86, and implement the hooks needed to add the segment base addresses. Enable HAVE_UNWIND_USER_COMPAT_FP if the system has compat mode compiled in. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- arch/x86/Kconfig | 1 + arch/x86/include/asm/unwind_user.h | 50 ++++++++++++++++++++++++ arch/x86/include/asm/unwind_user_types.h | 17 ++++++++ 3 files changed, 68 insertions(+) create mode 100644 arch/x86/include/asm/unwind_user_types.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index b5a85d2be5ee..35d3b01b65c6 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -301,6 +301,7 @@ config X86 select HAVE_SYSCALL_TRACEPOINTS select HAVE_UACCESS_VALIDATION if HAVE_OBJTOOL select HAVE_UNSTABLE_SCHED_CLOCK + select HAVE_UNWIND_USER_COMPAT_FP if IA32_EMULATION select HAVE_UNWIND_USER_FP if X86_64 select HAVE_USER_RETURN_NOTIFIER select HAVE_GENERIC_VDSO diff --git a/arch/x86/include/asm/unwind_user.h b/arch/x86/include/asm/unwi= nd_user.h index 8597857bf896..bb1148111259 100644 --- a/arch/x86/include/asm/unwind_user.h +++ b/arch/x86/include/asm/unwind_user.h @@ -2,10 +2,60 @@ #ifndef _ASM_X86_UNWIND_USER_H #define _ASM_X86_UNWIND_USER_H =20 +#include +#include +#include + #define ARCH_INIT_USER_FP_FRAME \ .cfa_off =3D (s32)sizeof(long) * 2, \ .ra_off =3D (s32)sizeof(long) * -1, \ .fp_off =3D (s32)sizeof(long) * -2, \ .use_fp =3D true, =20 +#ifdef CONFIG_IA32_EMULATION + +#define ARCH_INIT_USER_COMPAT_FP_FRAME \ + .cfa_off =3D (s32)sizeof(u32) * 2, \ + .ra_off =3D 
(s32)sizeof(u32) * -1, \ + .fp_off =3D (s32)sizeof(u32) * -2, \ + .use_fp =3D true, + +#define in_compat_mode(regs) !user_64bit_mode(regs) + +static inline void arch_unwind_user_init(struct unwind_user_state *state, + struct pt_regs *regs) +{ + unsigned long cs_base, ss_base; + + if (state->type !=3D UNWIND_USER_TYPE_COMPAT_FP) + return; + + scoped_guard(irqsave) { + cs_base =3D segment_base_address(regs->cs); + ss_base =3D segment_base_address(regs->ss); + } + + state->arch.cs_base =3D cs_base; + state->arch.ss_base =3D ss_base; + + state->ip +=3D cs_base; + state->sp +=3D ss_base; + state->fp +=3D ss_base; +} +#define arch_unwind_user_init arch_unwind_user_init + +static inline void arch_unwind_user_next(struct unwind_user_state *state) +{ + if (state->type !=3D UNWIND_USER_TYPE_COMPAT_FP) + return; + + state->ip +=3D state->arch.cs_base; + state->fp +=3D state->arch.ss_base; +} +#define arch_unwind_user_next arch_unwind_user_next + +#endif /* CONFIG_IA32_EMULATION */ + +#include + #endif /* _ASM_X86_UNWIND_USER_H */ diff --git a/arch/x86/include/asm/unwind_user_types.h b/arch/x86/include/as= m/unwind_user_types.h new file mode 100644 index 000000000000..d7074dc5f0ce --- /dev/null +++ b/arch/x86/include/asm/unwind_user_types.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_UNWIND_USER_TYPES_H +#define _ASM_UNWIND_USER_TYPES_H + +#ifdef CONFIG_IA32_EMULATION + +struct arch_unwind_user_state { + unsigned long ss_base; + unsigned long cs_base; +}; +#define arch_unwind_user_state arch_unwind_user_state + +#endif /* CONFIG_IA32_EMULATION */ + +#include + +#endif /* _ASM_UNWIND_USER_TYPES_H */ --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1A492882BC; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; b=UGWw4gGnJcHL/afNIopxijF81/pCICVuaABMwsRh/U5eNpsMeQt91AsRwCrWC2yDvfi9ryVpEurTix69Y7ULS5hwGZNUKqR4iQZTnayaDmrBvLmdQd+Go3maBr33+KQaJhtk0zgW2o8I6guvLIDXCGsIdUJBhvsCS+pEix2dNaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; bh=pMmMfZzOnxVYEWJ7fAqc5BkB5un/2j8V70xiTI9e5iA=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=RR94DYO5gmkBrWii2o5NXoU3zitWwUMFCztfUma5cf1Ie+10Rb1rY8BcXSPq5IZTtoODNw0ZOHGIs1Nla2H7iEooPTczDEIDzOy9zzm/xScW+I4fTJ2Jjd8ReUSwKT2o2HOGn49SVdWNeV79ZdYqPWP7oLNhNCuRv3ixs17NC/8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 88C43C4CEEC; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dQY-3R59; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200108.671429827@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:53 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 07/18] unwind_user/deferred: Add unwind_deferred_trace() References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Steven Rostedt Add a function that retrieves a user space stack trace and that must be called from a faultable context. The function unwind_deferred_trace() can be called by a tracer when a task is about to enter user space, or has just come back from user space and has interrupts enabled. This code is based on Josh Poimboeuf's deferred unwinding code: Link: https://lore.kernel.org/all/6052e8487746603bdb29b65f4033e739092d9925.= 1737511963.git.jpoimboe@kernel.org/ Signed-off-by: Steven Rostedt (Google) --- Changes since v6: https://lore.kernel.org/20250425145812.835672647@goodmis.= org - Use (current->flags & PF_EXITING) instead of checking !current->mm include/linux/sched.h | 5 +++ include/linux/unwind_deferred.h | 24 ++++++++++++++ include/linux/unwind_deferred_types.h | 9 +++++ kernel/fork.c | 4 +++ kernel/unwind/Makefile | 2 +- kernel/unwind/deferred.c | 48 +++++++++++++++++++++++++++ 6 files changed, 91 insertions(+), 1 deletion(-) create mode 100644 include/linux/unwind_deferred.h create mode 100644 include/linux/unwind_deferred_types.h create mode 100644 kernel/unwind/deferred.c diff --git a/include/linux/sched.h b/include/linux/sched.h index 4ecc0c6b1cb0..a1e1c07cadfb 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -47,6 +47,7 @@ #include #include #include +#include #include =20 /* task_struct member predeclarations (sorted alphabetically): */ @@ -1646,6 +1647,10 @@ struct task_struct { struct user_event_mm *user_event_mm; #endif =20 +#ifdef CONFIG_UNWIND_USER + struct unwind_task_info unwind_info; +#endif + /* CPU-specific state of this task: */ struct thread_struct 
thread; =20 diff --git a/include/linux/unwind_deferred.h b/include/linux/unwind_deferre= d.h new file mode 100644 index 000000000000..5064ebe38c4f --- /dev/null +++ b/include/linux/unwind_deferred.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_UNWIND_USER_DEFERRED_H +#define _LINUX_UNWIND_USER_DEFERRED_H + +#include +#include + +#ifdef CONFIG_UNWIND_USER + +void unwind_task_init(struct task_struct *task); +void unwind_task_free(struct task_struct *task); + +int unwind_deferred_trace(struct unwind_stacktrace *trace); + +#else /* !CONFIG_UNWIND_USER */ + +static inline void unwind_task_init(struct task_struct *task) {} +static inline void unwind_task_free(struct task_struct *task) {} + +static inline int unwind_deferred_trace(struct unwind_stacktrace *trace) {= return -ENOSYS; } + +#endif /* !CONFIG_UNWIND_USER */ + +#endif /* _LINUX_UNWIND_USER_DEFERRED_H */ diff --git a/include/linux/unwind_deferred_types.h b/include/linux/unwind_d= eferred_types.h new file mode 100644 index 000000000000..aa32db574e43 --- /dev/null +++ b/include/linux/unwind_deferred_types.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_UNWIND_USER_DEFERRED_TYPES_H +#define _LINUX_UNWIND_USER_DEFERRED_TYPES_H + +struct unwind_task_info { + unsigned long *entries; +}; + +#endif /* _LINUX_UNWIND_USER_DEFERRED_TYPES_H */ diff --git a/kernel/fork.c b/kernel/fork.c index c4b26cd8998b..8c79c7c2c553 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -105,6 +105,7 @@ #include #include #include +#include =20 #include #include @@ -991,6 +992,7 @@ void __put_task_struct(struct task_struct *tsk) WARN_ON(refcount_read(&tsk->usage)); WARN_ON(tsk =3D=3D current); =20 + unwind_task_free(tsk); sched_ext_free(tsk); io_uring_free(tsk); cgroup_free(tsk); @@ -2395,6 +2397,8 @@ __latent_entropy struct task_struct *copy_process( p->bpf_ctx =3D NULL; #endif =20 + unwind_task_init(p); + /* Perform scheduler related setup. Assign this task to a CPU. 
*/ retval =3D sched_fork(clone_flags, p); if (retval) diff --git a/kernel/unwind/Makefile b/kernel/unwind/Makefile index 349ce3677526..6752ac96d7e2 100644 --- a/kernel/unwind/Makefile +++ b/kernel/unwind/Makefile @@ -1 +1 @@ - obj-$(CONFIG_UNWIND_USER) +=3D user.o + obj-$(CONFIG_UNWIND_USER) +=3D user.o deferred.o diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c new file mode 100644 index 000000000000..5a3789e38c00 --- /dev/null +++ b/kernel/unwind/deferred.c @@ -0,0 +1,48 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Deferred user space unwinding + */ +#include +#include +#include +#include + +#define UNWIND_MAX_ENTRIES 512 + +int unwind_deferred_trace(struct unwind_stacktrace *trace) +{ + struct unwind_task_info *info =3D &current->unwind_info; + + /* Should always be called from faultable context */ + might_fault(); + + if (current->flags & PF_EXITING) + return -EINVAL; + + if (!info->entries) { + info->entries =3D kmalloc_array(UNWIND_MAX_ENTRIES, sizeof(= long), + GFP_KERNEL); + if (!info->entries) + return -ENOMEM; + } + + trace->nr =3D 0; + trace->entries =3D info->entries; + unwind_user(trace, UNWIND_MAX_ENTRIES); + + return 0; +} + +void unwind_task_init(struct task_struct *task) +{ + struct unwind_task_info *info =3D &task->unwind_info; + + memset(info, 0, sizeof(*info)); +} + +void unwind_task_free(struct task_struct *task) +{ + struct unwind_task_info *info =3D &task->unwind_info; + + kfree(info->entries); +} --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C6E0129345D; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; cv=none; 
b=iVJ2oHFvdxE6ajyG/e1IOIAoCwctSZ7KAfLCkVic8hR5eoPil0LZ4RDdqnRCl5tnV7uVaZs+ktTLPlSoFpzxw/88dM4R7KGvmCDsLndJ7XSYOMTyYXUG6l1Va6fnLbA3UXcmXx6faRsHT04Oti1AlkIrJl7qDC+rmFjFJ7cCo2w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043264; c=relaxed/simple; bh=mIyumrKmuS3VM7JNABw/AH++9n72b21yq81MJ7iQubg=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=dp8q8l8iQXmvBz/Fb51qXYLACvo21Hd3tHKiM7lv2EDvs8OLjTGJuIhoNszcQghYqCe14p2iwYenALDofhGsAL2Pf30LLmeBaEeTRQlG96GGIRNyQIkh9okJBHOhJQp8uYtU9LKpMc7bl4ZzrP+KkU+WtHzqzOJXTAccmwGHXwQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id A8894C4CEEF; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcC-00000001dR5-48Kk; Wed, 30 Apr 2025 16:01:08 -0400 Message-ID: <20250430200108.838930038@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:54 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 08/18] unwind_user/deferred: Add unwind cache References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Cache the results of the unwind to ensure the unwind is only performed once, even when called by multiple tracers. The cache's nr_entries gets cleared every time the task exits the kernel. When a stacktrace is requested, nr_entries gets set to the number of entries in the stacktrace. If another stacktrace is requested while nr_entries is not zero, the cache already contains the same stacktrace that would be retrieved, so it is not processed again and the cached entries are given to the caller. Co-developed-by: Steven Rostedt (Google) Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- include/linux/entry-common.h | 2 ++ include/linux/unwind_deferred.h | 7 +++++++ include/linux/unwind_deferred_types.h | 7 ++++++- kernel/unwind/deferred.c | 27 ++++++++++++++++++++------- 4 files changed, 35 insertions(+), 8 deletions(-) diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h index fc61d0205c97..725ec0e87cdd 100644 --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -12,6 +12,7 @@ #include #include #include +#include =20 #include =20 @@ -361,6 +362,7 @@ static __always_inline void exit_to_user_mode(void) lockdep_hardirqs_on_prepare(); instrumentation_end(); =20 + unwind_exit_to_user_mode(); user_enter_irqoff(); arch_exit_to_user_mode(); lockdep_hardirqs_on(CALLER_ADDR0); diff --git a/include/linux/unwind_deferred.h b/include/linux/unwind_deferre= d.h index 5064ebe38c4f..c2d760e5e257 100644 --- a/include/linux/unwind_deferred.h +++ b/include/linux/unwind_deferred.h @@ -12,6 +12,11 @@ void unwind_task_free(struct task_struct *task); =20 int 
unwind_deferred_trace(struct unwind_stacktrace *trace); =20 +static __always_inline void unwind_exit_to_user_mode(void) +{ + current->unwind_info.cache.nr_entries =3D 0; +} + #else /* !CONFIG_UNWIND_USER */ =20 static inline void unwind_task_init(struct task_struct *task) {} @@ -19,6 +24,8 @@ static inline void unwind_task_free(struct task_struct *t= ask) {} =20 static inline int unwind_deferred_trace(struct unwind_stacktrace *trace) {= return -ENOSYS; } =20 +static inline void unwind_exit_to_user_mode(void) {} + #endif /* !CONFIG_UNWIND_USER */ =20 #endif /* _LINUX_UNWIND_USER_DEFERRED_H */ diff --git a/include/linux/unwind_deferred_types.h b/include/linux/unwind_d= eferred_types.h index aa32db574e43..b3b7389ee6eb 100644 --- a/include/linux/unwind_deferred_types.h +++ b/include/linux/unwind_deferred_types.h @@ -2,8 +2,13 @@ #ifndef _LINUX_UNWIND_USER_DEFERRED_TYPES_H #define _LINUX_UNWIND_USER_DEFERRED_TYPES_H =20 -struct unwind_task_info { +struct unwind_cache { unsigned long *entries; + unsigned int nr_entries; +}; + +struct unwind_task_info { + struct unwind_cache cache; }; =20 #endif /* _LINUX_UNWIND_USER_DEFERRED_TYPES_H */ diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c index 5a3789e38c00..89ed04b1c527 100644 --- a/kernel/unwind/deferred.c +++ b/kernel/unwind/deferred.c @@ -12,6 +12,7 @@ int unwind_deferred_trace(struct unwind_stacktrace *trace) { struct unwind_task_info *info =3D &current->unwind_info; + struct unwind_cache *cache =3D &info->cache; =20 /* Should always be called from faultable context */ might_fault(); @@ -19,17 +20,29 @@ int unwind_deferred_trace(struct unwind_stacktrace *tra= ce) if (current->flags & PF_EXITING) return -EINVAL; =20 - if (!info->entries) { - info->entries =3D kmalloc_array(UNWIND_MAX_ENTRIES, sizeof(= long), - GFP_KERNEL); - if (!info->entries) - return -ENOMEM; + if (!cache->entries) { + cache->entries =3D kmalloc_array(UNWIND_MAX_ENTRIES, sizeof(long), + GFP_KERNEL); + if (!cache->entries) + return -ENOMEM; + 
} + + trace->entries =3D cache->entries; + + if (cache->nr_entries) { + /* + * The user stack has already been previously unwound in th= is + * entry context. Skip the unwind and use the cache. + */ + trace->nr =3D cache->nr_entries; + return 0; } =20 trace->nr =3D 0; - trace->entries =3D info->entries; unwind_user(trace, UNWIND_MAX_ENTRIES); =20 + cache->nr_entries =3D trace->nr; + return 0; } =20 @@ -44,5 +57,5 @@ void unwind_task_free(struct task_struct *task) { struct unwind_task_info *info =3D &task->unwind_info; =20 - kfree(info->entries); + kfree(info->cache.entries); } --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E62A629372E; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; cv=none; b=ZNCVcM71HSIAnRyqgSAl97/mYuE16wgGLsdj+1xDGXsjvqV37Ynm/Eo6vXSrFiNC9Gfk9TImiHOpuzjAqkF9Kih/6I5mUgErWLDmT7nZRV0J25YRoICyOaWO/Gl0d0r/psN02ypdR3ZlxMiV2TTe9KhM8aOBAzYkc/hbuae2lIg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; c=relaxed/simple; bh=bYoE3lWyc/7+xI2i1oiE1igTNM0T6KYUsdoBK92sjnU=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=sAWLAxOzsBYALELKOUSKWcydk1sOTyUXNmVClMdKnlQPEkcDrVKtShxpOXjkkneWS0/Bh2k7NQlIMwXkET7VqHyJvuEBjTIE7fRBd3zn0k7WGAZ4swHEG9u6+TEZTc2HlDOVRj73I1lkw/kyH0yynN+exWxW54ZW23OGZrTWbZo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id CC709C4CEEC; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 
1uADcD-00000001dRa-0e4d; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.007677803@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:55 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring , Namhyung Kim Subject: [PATCH v7 09/18] perf: Remove get_perf_callchain() init_nr argument References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf The 'init_nr' argument has double duty: it's used to initialize both the number of contexts and the number of stack entries. That's confusing and the callers always pass zero anyway. Hard code the zero. 
Acked-by: Namhyung Kim Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- include/linux/perf_event.h | 2 +- kernel/bpf/stackmap.c | 4 ++-- kernel/events/callchain.c | 12 ++++++------ kernel/events/core.c | 2 +- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 947ad12dfdbe..3cc0b0ea0afa 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1651,7 +1651,7 @@ DECLARE_PER_CPU(struct perf_callchain_entry, perf_cal= lchain_entry); extern void perf_callchain_user(struct perf_callchain_entry_ctx *entry, st= ruct pt_regs *regs); extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, = struct pt_regs *regs); extern struct perf_callchain_entry * -get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool us= er, +get_perf_callchain(struct pt_regs *regs, bool kernel, bool user, u32 max_stack, bool crosstask, bool add_mark); extern int get_callchain_buffers(int max_stack); extern void put_callchain_buffers(void); diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 3615c06b7dfa..ec3a57a5fba1 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -314,7 +314,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, str= uct bpf_map *, map, if (max_depth > sysctl_perf_event_max_stack) max_depth =3D sysctl_perf_event_max_stack; =20 - trace =3D get_perf_callchain(regs, 0, kernel, user, max_depth, + trace =3D get_perf_callchain(regs, kernel, user, max_depth, false, false); =20 if (unlikely(!trace)) @@ -451,7 +451,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struc= t task_struct *task, else if (kernel && task) trace =3D get_callchain_entry_for_task(task, max_depth); else - trace =3D get_perf_callchain(regs, 0, kernel, user, max_depth, + trace =3D get_perf_callchain(regs, kernel, user, max_depth, crosstask, false); =20 if (unlikely(!trace) || trace->nr < skip) { diff --git a/kernel/events/callchain.c 
b/kernel/events/callchain.c index 6c83ad674d01..b0f5bd228cd8 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -217,7 +217,7 @@ static void fixup_uretprobe_trampoline_entries(struct p= erf_callchain_entry *entr } =20 struct perf_callchain_entry * -get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool us= er, +get_perf_callchain(struct pt_regs *regs, bool kernel, bool user, u32 max_stack, bool crosstask, bool add_mark) { struct perf_callchain_entry *entry; @@ -228,11 +228,11 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr,= bool kernel, bool user, if (!entry) return NULL; =20 - ctx.entry =3D entry; - ctx.max_stack =3D max_stack; - ctx.nr =3D entry->nr =3D init_nr; - ctx.contexts =3D 0; - ctx.contexts_maxed =3D false; + ctx.entry =3D entry; + ctx.max_stack =3D max_stack; + ctx.nr =3D entry->nr =3D 0; + ctx.contexts =3D 0; + ctx.contexts_maxed =3D false; =20 if (kernel && !user_mode(regs)) { if (add_mark) diff --git a/kernel/events/core.c b/kernel/events/core.c index 3c69a1a3f41c..67581babe9ba 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8110,7 +8110,7 @@ perf_callchain(struct perf_event *event, struct pt_re= gs *regs) if (!kernel && !user) return &__empty_callchain; =20 - callchain =3D get_perf_callchain(regs, 0, kernel, user, + callchain =3D get_perf_callchain(regs, kernel, user, max_stack, crosstask, true); return callchain ?: &__empty_callchain; } --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E2D6294A1B; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; cv=none; 
b=RC8FR1BPQ5JPZNqKr0FtXnHWzWc0Ba5RWDxuVTfYh63pVUYkay3RZG2yZDqmA+09ricQEZG035DVq1FsRv8RnjmPKshR6tKiKfi6cqujOAFJNfInR3g4fzZApCFaecibrEig1QbrL8JC1ABJCeVYXqpxg0Vo5WnAqK/i4wk/FOA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; c=relaxed/simple; bh=6DE7KZOXrlLO0p0yUtPCwZQ+NilkphXCklG8RJnYqCw=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=CjplJyVnH+BB7T7rtX15mdLJk3gF8VhS1KkTkPNHDr2A/tg0QStvI1oYC8/x8F/fl/KPn2sVjyyyK35fusbw7Pc7g+G55QCH7OeNlm0rpYDho5zb99eW9O41neb7wsVkJtxwjTNUF+GypO8XtfmhsvJC1oLYz+yr4zD6NlATvJU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id F3A0AC4CEED; Wed, 30 Apr 2025 20:01:04 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcD-00000001dS5-1Lub; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.173686252@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:56 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 10/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf get_perf_callchain() doesn't support cross-task unwinding for user space stacks. Have it return NULL if both the crosstask and user arguments are set. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- kernel/events/callchain.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c index b0f5bd228cd8..abf258913ab6 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, = bool user, struct perf_callchain_entry_ctx ctx; int rctx, start_entry_idx; =20 + /* crosstask is not supported for user stacks */ + if (crosstask && user) + return NULL; + entry =3D get_callchain_entry(&rctx); if (!entry) return NULL; @@ -249,9 +253,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, b= ool user, } =20 if (regs) { - if (crosstask) - goto exit_put; - if (add_mark) perf_callchain_store_context(&ctx, PERF_CONTEXT_USER); =20 @@ -261,7 +262,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, b= ool user, } } =20 -exit_put: put_callchain_entry(rctx); =20 return entry; --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8EEC02951A0; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; cv=none; b=G3olADn9HJ13Dfsxh6kkU9KDgXc+ftZ0cPwE9uU9/sUDIIVZ6RAKsUVIXN9ppL8O7ER6wxPjAULh98+E+zjMssTZCI1lkQvNhM2GSTYhDP4hDD7xjo4sIngIuuQ95ec8Nwx/Yy4llhOZHFUz+H33+UOVvM9+Qy6HOsU4Y8tQqCU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; c=relaxed/simple; bh=UruFrykajNjxMX/0A6fTNZ1MktM/smIbdfB+M9Tloe0=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=fNhsuU7H6yXr81NUgaE1aGq0hqx5YjBl4KIuwgD61KMS/BE1z1EcDoKbkSMSr71Zxcpb/EY867Sgf9meDqsZcc7VMd7AIzpmhFJsuICwIRcuczH+7J4bWTTveCqS4MkyyDsXg0w+UIGXBgl9FL9KbseC8lfwN6/poH9lSnDYhu4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 29566C4CEEC; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcD-00000001dSb-23nB; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.343166050@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:57 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 11/18] perf: Use current->flags & PF_KTHREAD instead of current->mm == NULL References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Steven Rostedt To determine if a task is a kernel thread or not, it is more reliable to use (current->flags & PF_KTHREAD) than to rely on current->mm being NULL. That is because some kernel tasks (io_uring helpers) may have a mm field. Link: https://lore.kernel.org/linux-trace-kernel/20250424163607.GE18306@noi= sy.programming.kicks-ass.net/ Signed-off-by: Steven Rostedt (Google) --- kernel/events/callchain.c | 6 +++--- kernel/events/core.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c index abf258913ab6..cda145dc11bd 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -246,10 +246,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel,= bool user, =20 if (user) { if (!user_mode(regs)) { - if (current->mm) - regs =3D task_pt_regs(current); - else + if (current->flags & PF_KTHREAD) regs =3D NULL; + else + regs =3D task_pt_regs(current); } =20 if (regs) { diff --git a/kernel/events/core.c b/kernel/events/core.c index 67581babe9ba..430dd158b1ee 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7989,7 +7989,7 @@ static u64 perf_virt_to_phys(u64 virt) * Try IRQ-safe get_user_page_fast_only first. * If failed, leave phys_addr as 0. 
*/ - if (current->mm !=3D NULL) { + if (!(current->flags & PF_KTHREAD)) { struct page *p; =20 pagefault_disable(); --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76BE92949FB; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; cv=none; b=n5HeAgwAw+wfrWJMeUxAxdn0TAskH0y2q9MSEZrGmzoEdrV2gn3UpYl5u7N8VKafZVaYE7ADx628+AEGIvV8nADF8833McQejiUsGU9soKKx3MqP1Q4n6TVBMRJOw3Z1AfkNafprqlE2OP1DCslhBqNBDgDerWVsGDYG3wMNTic= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; c=relaxed/simple; bh=mggx5smXanCvg7n8gnR/FploZ9NPITBU1NWvDKMpMos=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=NJ8MAf9cp7y7yEPGF7hHAAr9N3oyH50rtf2f5kjV2OQbWsV2ujA557+XyGun1ZfbMbz4KyEeykEXMw1Gpc/zI6xOA+gHGNEF4E1OaADlb6+d2INi76GZCqktm9N37hkjT0mrT8uUm2MUi7KFBkWPyIbZRLvs5zR0PhOlkrnmKrY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 570FFC4CEE7; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcD-00000001dT5-2lNA; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.512005059@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:58 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung 
Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 12/18] perf: Simplify get_perf_callchain() user logic References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Simplify the get_perf_callchain() user logic a bit. task_pt_regs() should never be NULL. Acked-by: Namhyung Kim Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- kernel/events/callchain.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c index cda145dc11bd..2798c0c9f782 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -247,21 +247,19 @@ get_perf_callchain(struct pt_regs *regs, bool kernel,= bool user, if (user) { if (!user_mode(regs)) { if (current->flags & PF_KTHREAD) - regs =3D NULL; - else - regs =3D task_pt_regs(current); + goto exit_put; + regs =3D task_pt_regs(current); } =20 - if (regs) { - if (add_mark) - perf_callchain_store_context(&ctx, PERF_CONTEXT_USER); + if (add_mark) + perf_callchain_store_context(&ctx, PERF_CONTEXT_USER); =20 - start_entry_idx =3D entry->nr; - perf_callchain_user(&ctx, regs); - fixup_uretprobe_trampoline_entries(entry, start_entry_idx); - } + start_entry_idx =3D entry->nr; + perf_callchain_user(&ctx, regs); + fixup_uretprobe_trampoline_entries(entry, start_entry_idx); } =20 +exit_put: put_callchain_entry(rctx); =20 return entry; --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org 
[10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A91EE295500; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; cv=none; b=RCdlJoWDHAfK0okSD+8du54a5nEGotSd6+arxgTgGuyAa6CZImAiVD0ujzzSfvnCit/Q8oyraNQ1H0wPU9D8/5quz8NDdxmEWtHy3pSZ2JesmlqW/quStFdLf6Yc1MxhnVXfQvdQB/l7+D3aQ96yt5wGrx5MvICIYaUudslLrMU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043265; c=relaxed/simple; bh=wuNRjbmyz6MHvkwS+XtlrPebpFzfrgYzdjwYBw3T+s0=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=PmoJzZQDKTnESk4Yj3WWuibRCF1gV80QnoyPdANnFT06rzuawn5fP4F3+ZLUilMYEJmyckh5eFlu2hqS1k9FjGiWSuS5nUfqq5h5YA2HbbgS5RqiQn6Qnm51jLmQYJLXbG/KjI/WrZxoyzwav6wHbR5yUzX0ozvjPCJc2pla9bI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7C17EC4CEEE; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcD-00000001dTb-3SKR; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.678557621@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:57:59 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau 
Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 13/18] perf: Skip user unwind if the task is a kernel thread. References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf If the task is not a user thread, there's no user stack to unwind. Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- kernel/events/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 430dd158b1ee..ec9edf602974 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8101,7 +8101,8 @@ struct perf_callchain_entry * perf_callchain(struct perf_event *event, struct pt_regs *regs) { bool kernel =3D !event->attr.exclude_callchain_kernel; - bool user =3D !event->attr.exclude_callchain_user; + bool user =3D !event->attr.exclude_callchain_user && + !(current->flags & PF_KTHREAD); /* Disallow cross-task user callchains. 
*/ bool crosstask =3D event->ctx->task && event->ctx->task !=3D current; const u32 max_stack =3D event->attr.sample_max_stack; --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DCB8295528; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; cv=none; b=Lcwa6i4R5uw77bclPMMh++3rB7WaKq14zfN/MrePvjX4fL0p1+5aBBf4H1UaINi2eShLt6BaAJgWGI9pLJxeplwTE1nEdlUC8plE0HVWq8Vwwg/FWSNacTtoDx2ipzEDMCxvVKHkAkxYcLsxYTYUjonUH9AnQxKr+VH17hBYV9M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; c=relaxed/simple; bh=/RBgaoj6pwzNUDRNcaoyijEn+pwQFYxfV1Nt8ojeJMQ=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=f7oGdjgYH+MZQEkEdG0LDEktc7e3vXB6oGISn4mR0NV2aGa08Pec2+3N3WgrIBFhijyamhLT4+bTb+yTYEmo2atIioke3ZL72bnwgG7ABrbAZR6gIpppKNZbC8vsRWs4Gr8rHfoKYr11Ii7KssCeMEvBCXAor5XsD0h5DtD/feI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id A8C1CC4CEFB; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcD-00000001dU6-4AdL; Wed, 30 Apr 2025 16:01:09 -0400 Message-ID: <20250430200109.845004761@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:58:00 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , 
Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 14/18] perf: Support deferred user callchains References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Josh Poimboeuf Use the new unwind_deferred_trace() interface (if available) to defer unwinds to task context. This will allow the use of .sframe (when it becomes available) and also prevents duplicate userspace unwinds. Suggested-by: Peter Zijlstra Co-developed-by: Steven Rostedt (Google) Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- Changes since v6: https://lore.kernel.org/20250425145814.033122445@goodmis.= org - Only defer unwind if event is attached to a specific task (not global per= CPU) - Changed a !current->mm to a (current->flags & PF_KTHREAD) - Added a missing rcuwait_init(&event->pending_unwind_wait); arch/Kconfig | 3 + include/linux/perf_event.h | 7 +- include/uapi/linux/perf_event.h | 19 ++- kernel/bpf/stackmap.c | 4 +- kernel/events/callchain.c | 11 +- kernel/events/core.c | 168 +++++++++++++++++++++++++- tools/include/uapi/linux/perf_event.h | 19 ++- 7 files changed, 223 insertions(+), 8 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index dbb1cc89e040..681946b5f2c4 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -446,6 +446,9 @@ config HAVE_UNWIND_USER_COMPAT_FP bool depends on HAVE_UNWIND_USER_FP =20 +config HAVE_PERF_CALLCHAIN_DEFERRED + bool + config HAVE_PERF_REGS bool help diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 3cc0b0ea0afa..10603a8344d3 100644 --- 
a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -62,6 +62,7 @@ struct perf_guest_info_callbacks { #include #include #include +#include #include =20 struct perf_callchain_entry { @@ -830,6 +831,10 @@ struct perf_event { struct callback_head pending_task; unsigned int pending_work; =20 + unsigned int pending_unwind_callback; + struct callback_head pending_unwind_work; + struct rcuwait pending_unwind_wait; + atomic_t event_limit; =20 /* address range filters */ @@ -1652,7 +1657,7 @@ extern void perf_callchain_user(struct perf_callchain= _entry_ctx *entry, struct p extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, = struct pt_regs *regs); extern struct perf_callchain_entry * get_perf_callchain(struct pt_regs *regs, bool kernel, bool user, - u32 max_stack, bool crosstask, bool add_mark); + u32 max_stack, bool crosstask, bool add_mark, bool defer_user); extern int get_callchain_buffers(int max_stack); extern void put_callchain_buffers(void); extern struct perf_callchain_entry *get_callchain_entry(int *rctx); diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_even= t.h index 5fc753c23734..65fe495c012e 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -462,7 +462,8 @@ struct perf_event_attr { inherit_thread : 1, /* children only inherit if cloned with CLONE_THR= EAD */ remove_on_exec : 1, /* event is removed from task on exec */ sigtrap : 1, /* send synchronous SIGTRAP on event */ - __reserved_1 : 26; + defer_callchain: 1, /* generate PERF_RECORD_CALLCHAIN_DEFERRED record= s */ + __reserved_1 : 25; =20 union { __u32 wakeup_events; /* wakeup every n events */ @@ -1228,6 +1229,21 @@ enum perf_event_type { */ PERF_RECORD_AUX_OUTPUT_HW_ID =3D 21, =20 + /* + * This user callchain capture was deferred until shortly before + * returning to user space. Previous samples would have kernel + * callchains only and they need to be stitched with this to make full + * callchains. 
+ * + * struct { + * struct perf_event_header header; + * u64 nr; + * u64 ips[nr]; + * struct sample_id sample_id; + * }; + */ + PERF_RECORD_CALLCHAIN_DEFERRED =3D 22, + PERF_RECORD_MAX, /* non-ABI */ }; =20 @@ -1258,6 +1274,7 @@ enum perf_callchain_context { PERF_CONTEXT_HV =3D (__u64)-32, PERF_CONTEXT_KERNEL =3D (__u64)-128, PERF_CONTEXT_USER =3D (__u64)-512, + PERF_CONTEXT_USER_DEFERRED =3D (__u64)-640, =20 PERF_CONTEXT_GUEST =3D (__u64)-2048, PERF_CONTEXT_GUEST_KERNEL =3D (__u64)-2176, diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index ec3a57a5fba1..339f7cbbcf36 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -315,7 +315,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, str= uct bpf_map *, map, max_depth =3D sysctl_perf_event_max_stack; =20 trace =3D get_perf_callchain(regs, kernel, user, max_depth, - false, false); + false, false, false); =20 if (unlikely(!trace)) /* couldn't fetch the stack trace */ @@ -452,7 +452,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struc= t task_struct *task, trace =3D get_callchain_entry_for_task(task, max_depth); else trace =3D get_perf_callchain(regs, kernel, user, max_depth, - crosstask, false); + crosstask, false, false); =20 if (unlikely(!trace) || trace->nr < skip) { if (may_fault) diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c index 2798c0c9f782..50c637e960b9 100644 --- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -218,7 +218,7 @@ static void fixup_uretprobe_trampoline_entries(struct p= erf_callchain_entry *entr =20 struct perf_callchain_entry * get_perf_callchain(struct pt_regs *regs, bool kernel, bool user, - u32 max_stack, bool crosstask, bool add_mark) + u32 max_stack, bool crosstask, bool add_mark, bool defer_user) { struct perf_callchain_entry *entry; struct perf_callchain_entry_ctx ctx; @@ -251,6 +251,15 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, = bool user, regs =3D task_pt_regs(current); } =20 + if (defer_user) { + 
/* + * Foretell the coming of PERF_RECORD_CALLCHAIN_DEFERRED + * which can be stitched to this one. + */ + perf_callchain_store_context(&ctx, PERF_CONTEXT_USER_DEFERRED); + goto exit_put; + } + if (add_mark) perf_callchain_store_context(&ctx, PERF_CONTEXT_USER); =20 diff --git a/kernel/events/core.c b/kernel/events/core.c index ec9edf602974..a5d9c6220589 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5537,6 +5537,89 @@ static bool exclusive_event_installable(struct perf_= event *event, return true; } =20 +static void perf_pending_unwind_sync(struct perf_event *event) +{ + might_sleep(); + + if (!event->pending_unwind_callback) + return; + + /* + * If the task is queued to the current task's queue, we + * obviously can't wait for it to complete. Simply cancel it. + */ + if (task_work_cancel(current, &event->pending_unwind_work)) { + event->pending_unwind_callback =3D 0; + local_dec(&event->ctx->nr_no_switch_fast); + return; + } + + /* + * All accesses related to the event are within the same RCU section in + * perf_event_callchain_deferred(). The RCU grace period before the + * event is freed will make sure all those accesses are complete by then. 
+ */ + rcuwait_wait_event(&event->pending_unwind_wait, !event->pending_unwind_ca= llback, TASK_UNINTERRUPTIBLE); +} + +struct perf_callchain_deferred_event { + struct perf_event_header header; + u64 nr; + u64 ips[]; +}; + +static void perf_event_callchain_deferred(struct callback_head *work) +{ + struct perf_event *event =3D container_of(work, struct perf_event, pendin= g_unwind_work); + struct perf_callchain_deferred_event deferred_event; + u64 callchain_context =3D PERF_CONTEXT_USER; + struct unwind_stacktrace trace; + struct perf_output_handle handle; + struct perf_sample_data data; + u64 nr; + + if (!event->pending_unwind_callback) + return; + + if (unwind_deferred_trace(&trace) < 0) + goto out; + + /* + * All accesses to the event must belong to the same implicit RCU + * read-side critical section as the ->pending_unwind_callback reset. + * See comment in perf_pending_unwind_sync(). + */ + guard(rcu)(); + + if (current->flags & PF_KTHREAD) + goto out; + + nr =3D trace.nr + 1 ; /* '+1' =3D=3D callchain_context */ + + deferred_event.header.type =3D PERF_RECORD_CALLCHAIN_DEFERRED; + deferred_event.header.misc =3D PERF_RECORD_MISC_USER; + deferred_event.header.size =3D sizeof(deferred_event) + (nr * sizeof(u64)= ); + + deferred_event.nr =3D nr; + + perf_event_header__init_id(&deferred_event.header, &data, event); + + if (perf_output_begin(&handle, &data, event, deferred_event.header.size)) + goto out; + + perf_output_put(&handle, deferred_event); + perf_output_put(&handle, callchain_context); + perf_output_copy(&handle, trace.entries, trace.nr * sizeof(u64)); + perf_event__output_id_sample(event, &handle, &data); + + perf_output_end(&handle); + +out: + event->pending_unwind_callback =3D 0; + local_dec(&event->ctx->nr_no_switch_fast); + rcuwait_wake_up(&event->pending_unwind_wait); +} + static void perf_free_addr_filters(struct perf_event *event); =20 /* vs perf_event_alloc() error */ @@ -5604,6 +5687,7 @@ static void _free_event(struct perf_event *event) { 
irq_work_sync(&event->pending_irq); irq_work_sync(&event->pending_disable_irq); + perf_pending_unwind_sync(event); =20 unaccount_event(event); =20 @@ -8097,6 +8181,65 @@ static u64 perf_get_page_size(unsigned long addr) =20 static struct perf_callchain_entry __empty_callchain =3D { .nr =3D 0, }; =20 +/* Returns the same as deferred_request() below */ +static int deferred_request_nmi(struct perf_event *event) +{ + struct callback_head *work =3D &event->pending_unwind_work; + int ret; + + if (event->pending_unwind_callback) + return 1; + + ret =3D task_work_add(current, work, TWA_NMI_CURRENT); + if (ret) + return ret; + + event->pending_unwind_callback =3D 1; + return 0; +} + +/* + * Returns: +* > 0 : if already queued. + * 0 : if it performed the queuing + * < 0 : if it did not get queued. + */ +static int deferred_request(struct perf_event *event) +{ + struct callback_head *work =3D &event->pending_unwind_work; + int pending; + int ret; + + /* Only defer for task events */ + if (!event->ctx->task) + return -EINVAL; + + if ((current->flags & PF_KTHREAD) || !user_mode(task_pt_regs(current))) + return -EINVAL; + + if (in_nmi()) + return deferred_request_nmi(event); + + guard(irqsave)(); + + /* callback already pending? */ + pending =3D READ_ONCE(event->pending_unwind_callback); + if (pending) + return 1; + + /* Claim the work unless an NMI just now swooped in to do so. */ + if (!try_cmpxchg(&event->pending_unwind_callback, &pending, 1)) + return 1; + + /* The work has been claimed, now schedule it. 
*/ + ret =3D task_work_add(current, work, TWA_RESUME); + if (WARN_ON_ONCE(ret)) { + WRITE_ONCE(event->pending_unwind_callback, 0); + return ret; + } + return 0; +} + struct perf_callchain_entry * perf_callchain(struct perf_event *event, struct pt_regs *regs) { @@ -8107,12 +8250,27 @@ perf_callchain(struct perf_event *event, struct pt_= regs *regs) bool crosstask =3D event->ctx->task && event->ctx->task !=3D current; const u32 max_stack =3D event->attr.sample_max_stack; struct perf_callchain_entry *callchain; + bool defer_user =3D IS_ENABLED(CONFIG_UNWIND_USER) && user && + event->attr.defer_callchain; =20 if (!kernel && !user) return &__empty_callchain; =20 - callchain =3D get_perf_callchain(regs, kernel, user, - max_stack, crosstask, true); + /* Disallow cross-task callchains. */ + if (event->ctx->task && event->ctx->task !=3D current) + return &__empty_callchain; + + if (defer_user) { + int ret =3D deferred_request(event); + if (!ret) + local_inc(&event->ctx->nr_no_switch_fast); + else if (ret < 0) + defer_user =3D false; + } + + callchain =3D get_perf_callchain(regs, kernel, user, max_stack, + crosstask, true, defer_user); + return callchain ?: &__empty_callchain; } =20 @@ -12776,6 +12934,8 @@ perf_event_alloc(struct perf_event_attr *attr, int = cpu, event->pending_disable_irq =3D IRQ_WORK_INIT_HARD(perf_pending_disable); init_task_work(&event->pending_task, perf_pending_task); =20 + rcuwait_init(&event->pending_unwind_wait); + mutex_init(&event->mmap_mutex); raw_spin_lock_init(&event->addr_filters.lock); =20 @@ -12944,6 +13104,10 @@ perf_event_alloc(struct perf_event_attr *attr, int= cpu, if (err) return ERR_PTR(err); =20 + if (event->attr.defer_callchain) + init_task_work(&event->pending_unwind_work, + perf_event_callchain_deferred); + /* symmetric to unaccount_event() in _free_event() */ account_event(event); =20 diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/lin= ux/perf_event.h index 5fc753c23734..65fe495c012e 100644 --- 
a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -462,7 +462,8 @@ struct perf_event_attr { inherit_thread : 1, /* children only inherit if cloned with CLONE_THR= EAD */ remove_on_exec : 1, /* event is removed from task on exec */ sigtrap : 1, /* send synchronous SIGTRAP on event */ - __reserved_1 : 26; + defer_callchain: 1, /* generate PERF_RECORD_CALLCHAIN_DEFERRED record= s */ + __reserved_1 : 25; =20 union { __u32 wakeup_events; /* wakeup every n events */ @@ -1228,6 +1229,21 @@ enum perf_event_type { */ PERF_RECORD_AUX_OUTPUT_HW_ID =3D 21, =20 + /* + * This user callchain capture was deferred until shortly before + * returning to user space. Previous samples would have kernel + * callchains only and they need to be stitched with this to make full + * callchains. + * + * struct { + * struct perf_event_header header; + * u64 nr; + * u64 ips[nr]; + * struct sample_id sample_id; + * }; + */ + PERF_RECORD_CALLCHAIN_DEFERRED =3D 22, + PERF_RECORD_MAX, /* non-ABI */ }; =20 @@ -1258,6 +1274,7 @@ enum perf_callchain_context { PERF_CONTEXT_HV =3D (__u64)-32, PERF_CONTEXT_KERNEL =3D (__u64)-128, PERF_CONTEXT_USER =3D (__u64)-512, + PERF_CONTEXT_USER_DEFERRED =3D (__u64)-640, =20 PERF_CONTEXT_GUEST =3D (__u64)-2048, PERF_CONTEXT_GUEST_KERNEL =3D (__u64)-2176, --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B4BF296D2E; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; cv=none; 
b=szw6nwZbnVyvvbSRHNTM9Kr7e4HXQw2AFFawNKM9TGN3m4tyCvUsMyt9UsxoKcB9pA1L0X/4/8B8kPQbpRCg1uFf75OC0QTB2HIDH8Xu0Zw+pA9EaoW2HJ316d4gJPJJbB1mR1V+87JK8c5k637lPjUTpyDdNfR8BfDEAJsImUs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; c=relaxed/simple; bh=jzN6CV2MqgOb/d1oFesv9PXnCLgcPzPLY+/5mDw+2uY=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=mx+0puByt1OxsWih8z4t6AA/8FNfkyxnnm869EiG7g8DnMFoG4cAJYX/P63ycrWiSw0zouW6HCZmfj0UTXUxb28tN8lzJE8TxxOH0duZpDB2aFsbZtxbauY4LYYgbhfAfNAhKb8kdtNhVHHRWkjTiKpvXWMySirNtEoyTT4d5uE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id D0D42C4AF0B; Wed, 30 Apr 2025 20:01:05 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcE-00000001dUc-0g5j; Wed, 30 Apr 2025 16:01:10 -0400 Message-ID: <20250430200110.014870250@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:58:01 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. 
Marchesi" , Alexander Aring Subject: [PATCH v7 15/18] perf tools: Minimal CALLCHAIN_DEFERRED support References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Namhyung Kim Add a new event type for deferred callchains and a new callback for the struct perf_tool. For now it doesn't actually handle the deferred callchains; it just marks the sample if it has PERF_CONTEXT_USER_DEFERRED in the callchain array. At least, perf report can dump the raw data with this change. Actually this requires the next commit to enable attr.defer_callchain, but if you already have a data file, it'll show the following result. $ perf report -D ... 0x5fe0@perf.data [0x40]: event: 22 . . ... raw event: size 64 bytes . 0000: 16 00 00 00 02 00 40 00 02 00 00 00 00 00 00 00 ......@.......= .. . 0010: 00 fe ff ff ff ff ff ff 4b d3 3f 25 45 7f 00 00 ........K.?%E.= .. . 0020: 21 03 00 00 21 03 00 00 43 02 12 ab 05 00 00 00 !...!...C.....= .. . 0030: 00 00 00 00 00 00 00 00 09 00 00 00 00 00 00 00 ..............= .. 0 24344920643 0x5fe0 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 801= /801: 0 ... FP chain: nr:2 ..... 0: fffffffffffffe00 ..... 1: 00007f45253fd34b : unhandled! 
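The raw dump above can be checked by hand against the record layout documented in the uapi hunk of 14/18 (header, nr, ips[nr], then sample_id). A minimal stand-alone sketch follows; it is not part of the patch. It assumes the dump was taken on a little-endian machine, and struct deferred_rec, rd_le(), decode_deferred() and MAX_IPS are hypothetical names made up for this illustration, not perf APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the uapi hunks of this series. */
#define REC_CALLCHAIN_DEFERRED	22			/* PERF_RECORD_CALLCHAIN_DEFERRED */
#define CTX_USER		((uint64_t)-512)	/* PERF_CONTEXT_USER */

#define MAX_IPS 8	/* arbitrary cap, for this illustration only */

/* Hypothetical decoded view of the record; not a perf structure. */
struct deferred_rec {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
	uint64_t nr;
	uint64_t ips[MAX_IPS];
};

/* The 64 bytes of the raw event shown in the perf report -D dump. */
static const uint8_t raw_dump[64] = {
	0x16, 0x00, 0x00, 0x00, 0x02, 0x00, 0x40, 0x00,
	0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0x4b, 0xd3, 0x3f, 0x25, 0x45, 0x7f, 0x00, 0x00,
	0x21, 0x03, 0x00, 0x00, 0x21, 0x03, 0x00, 0x00,
	0x43, 0x02, 0x12, 0xab, 0x05, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Read an n-byte little-endian integer (assumption: little-endian dump). */
static uint64_t rd_le(const uint8_t *p, int n)
{
	uint64_t v = 0;

	for (int i = n - 1; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Returns 0 on success, -1 if buf does not hold a well-formed record. */
static int decode_deferred(const uint8_t *buf, size_t len,
			   struct deferred_rec *out)
{
	assert(buf && out);

	/* 8 bytes of header plus the 8-byte nr field */
	if (len < 16)
		return -1;

	out->type = (uint32_t)rd_le(buf, 4);
	out->misc = (uint16_t)rd_le(buf + 4, 2);
	out->size = (uint16_t)rd_le(buf + 6, 2);
	out->nr   = rd_le(buf + 8, 8);

	if (out->type != REC_CALLCHAIN_DEFERRED || out->size > len)
		return -1;
	if (out->nr > MAX_IPS || 16 + out->nr * 8 > out->size)
		return -1;

	for (uint64_t i = 0; i < out->nr; i++)
		out->ips[i] = rd_le(buf + 16 + i * 8, 8);
	return 0;
}
```

Fed the 64 bytes above, decode_deferred() yields type 22, misc 0x2, size 64 and nr 2, with ips[0] equal to PERF_CONTEXT_USER (fffffffffffffe00) and ips[1] equal to 00007f45253fd34b, which matches the FP chain that perf report prints. The remaining 32 bytes are the sample_id trailer; its layout depends on attr.sample_type, so it is not decoded here.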
Signed-off-by: Namhyung Kim Signed-off-by: Josh Poimboeuf Signed-off-by: Steven Rostedt (Google) --- tools/lib/perf/include/perf/event.h | 7 +++++++ tools/perf/util/event.c | 1 + tools/perf/util/evsel.c | 15 +++++++++++++++ tools/perf/util/machine.c | 1 + tools/perf/util/perf_event_attr_fprintf.c | 1 + tools/perf/util/sample.h | 3 ++- tools/perf/util/session.c | 17 +++++++++++++++++ tools/perf/util/tool.c | 1 + tools/perf/util/tool.h | 3 ++- 9 files changed, 47 insertions(+), 2 deletions(-) diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/p= erf/event.h index 37bb7771d914..f643a6a2b9fc 100644 --- a/tools/lib/perf/include/perf/event.h +++ b/tools/lib/perf/include/perf/event.h @@ -151,6 +151,12 @@ struct perf_record_switch { __u32 next_prev_tid; }; =20 +struct perf_record_callchain_deferred { + struct perf_event_header header; + __u64 nr; + __u64 ips[]; +}; + struct perf_record_header_attr { struct perf_event_header header; struct perf_event_attr attr; @@ -494,6 +500,7 @@ union perf_event { struct perf_record_read read; struct perf_record_throttle throttle; struct perf_record_sample sample; + struct perf_record_callchain_deferred callchain_deferred; struct perf_record_bpf_event bpf; struct perf_record_ksymbol ksymbol; struct perf_record_text_poke_event text_poke; diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index c23b77f8f854..fec86519b7d4 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -58,6 +58,7 @@ static const char *perf_event__names[] =3D { [PERF_RECORD_CGROUP] =3D "CGROUP", [PERF_RECORD_TEXT_POKE] =3D "TEXT_POKE", [PERF_RECORD_AUX_OUTPUT_HW_ID] =3D "AUX_OUTPUT_HW_ID", + [PERF_RECORD_CALLCHAIN_DEFERRED] =3D "CALLCHAIN_DEFERRED", [PERF_RECORD_HEADER_ATTR] =3D "ATTR", [PERF_RECORD_HEADER_EVENT_TYPE] =3D "EVENT_TYPE", [PERF_RECORD_HEADER_TRACING_DATA] =3D "TRACING_DATA", diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 3c030da2e477..b872236a2413 100644 --- a/tools/perf/util/evsel.c 
+++ b/tools/perf/util/evsel.c @@ -2948,6 +2948,18 @@ int evsel__parse_sample(struct evsel *evsel, union p= erf_event *event, data->data_src =3D PERF_MEM_DATA_SRC_NONE; data->vcpu =3D -1; =20 + if (event->header.type =3D=3D PERF_RECORD_CALLCHAIN_DEFERRED) { + const u64 max_callchain_nr =3D UINT64_MAX / sizeof(u64); + + data->callchain =3D (struct ip_callchain *)&event->callchain_deferred.nr; + if (data->callchain->nr > max_callchain_nr) + return -EFAULT; + + if (evsel->core.attr.sample_id_all) + perf_evsel__parse_id_sample(evsel, event, data); + return 0; + } + if (event->header.type !=3D PERF_RECORD_SAMPLE) { if (!evsel->core.attr.sample_id_all) return 0; @@ -3078,6 +3090,9 @@ int evsel__parse_sample(struct evsel *evsel, union pe= rf_event *event, if (data->callchain->nr > max_callchain_nr) return -EFAULT; sz =3D data->callchain->nr * sizeof(u64); + if (evsel->core.attr.defer_callchain && data->callchain->nr >=3D 1 && + data->callchain->ips[data->callchain->nr - 1] =3D=3D PERF_CONTEXT_US= ER_DEFERRED) + data->deferred_callchain =3D true; OVERFLOW_CHECK(array, sz, max_size); array =3D (void *)array + sz; } diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 2531b373f2cf..df76adce89ff 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -2089,6 +2089,7 @@ static int add_callchain_ip(struct thread *thread, *cpumode =3D PERF_RECORD_MISC_KERNEL; break; case PERF_CONTEXT_USER: + case PERF_CONTEXT_USER_DEFERRED: *cpumode =3D PERF_RECORD_MISC_USER; break; default: diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/pe= rf_event_attr_fprintf.c index 66b666d9ce64..abfd9b9a718c 100644 --- a/tools/perf/util/perf_event_attr_fprintf.c +++ b/tools/perf/util/perf_event_attr_fprintf.c @@ -343,6 +343,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_even= t_attr *attr, PRINT_ATTRf(inherit_thread, p_unsigned); PRINT_ATTRf(remove_on_exec, p_unsigned); PRINT_ATTRf(sigtrap, p_unsigned); + PRINT_ATTRf(defer_callchain, 
p_unsigned); =20 PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsig= ned, false); PRINT_ATTRf(bp_type, p_unsigned); diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h index 0e96240052e9..9d6e2f14551c 100644 --- a/tools/perf/util/sample.h +++ b/tools/perf/util/sample.h @@ -108,7 +108,8 @@ struct perf_sample { u16 p_stage_cyc; u16 retire_lat; }; - bool no_hw_idx; /* No hw_idx collected in branch_stack */ + bool no_hw_idx; /* No hw_idx collected in branch_stack */ + bool deferred_callchain; /* Has deferred user callchains */ char insn[MAX_INSN]; void *raw_data; struct ip_callchain *callchain; diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 60fb9997ea0d..30fb1d281be8 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -715,6 +715,7 @@ static perf_event__swap_op perf_event__swap_ops[] =3D { [PERF_RECORD_CGROUP] =3D perf_event__cgroup_swap, [PERF_RECORD_TEXT_POKE] =3D perf_event__text_poke_swap, [PERF_RECORD_AUX_OUTPUT_HW_ID] =3D perf_event__all64_swap, + [PERF_RECORD_CALLCHAIN_DEFERRED] =3D perf_event__all64_swap, [PERF_RECORD_HEADER_ATTR] =3D perf_event__hdr_attr_swap, [PERF_RECORD_HEADER_EVENT_TYPE] =3D perf_event__event_type_swap, [PERF_RECORD_HEADER_TRACING_DATA] =3D perf_event__tracing_data_swap, @@ -1118,6 +1119,19 @@ static void dump_sample(struct evsel *evsel, union p= erf_event *event, sample_read__printf(sample, evsel->core.attr.read_format); } =20 +static void dump_deferred_callchain(struct evsel *evsel, union perf_event = *event, + struct perf_sample *sample) +{ + if (!dump_trace) + return; + + printf("(IP, 0x%x): %d/%d: %#" PRIx64 "\n", + event->header.misc, sample->pid, sample->tid, sample->ip); + + if (evsel__has_callchain(evsel)) + callchain__printf(evsel, sample); +} + static void dump_read(struct evsel *evsel, union perf_event *event) { struct perf_record_read *read_event =3D &event->read; @@ -1348,6 +1362,9 @@ static int machines__deliver_event(struct machines *m= 
achines, return tool->text_poke(tool, event, sample, machine); case PERF_RECORD_AUX_OUTPUT_HW_ID: return tool->aux_output_hw_id(tool, event, sample, machine); + case PERF_RECORD_CALLCHAIN_DEFERRED: + dump_deferred_callchain(evsel, event, sample); + return tool->callchain_deferred(tool, event, sample, evsel, machine); default: ++evlist->stats.nr_unknown_events; return -1; diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c index 3b7f390f26eb..e78f16de912e 100644 --- a/tools/perf/util/tool.c +++ b/tools/perf/util/tool.c @@ -259,6 +259,7 @@ void perf_tool__init(struct perf_tool *tool, bool order= ed_events) tool->read =3D process_event_sample_stub; tool->throttle =3D process_event_stub; tool->unthrottle =3D process_event_stub; + tool->callchain_deferred =3D process_event_sample_stub; tool->attr =3D process_event_synth_attr_stub; tool->event_update =3D process_event_synth_event_update_stub; tool->tracing_data =3D process_event_synth_tracing_data_stub; diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h index db1c7642b0d1..9987bbde6d5e 100644 --- a/tools/perf/util/tool.h +++ b/tools/perf/util/tool.h @@ -42,7 +42,8 @@ enum show_feature_header { =20 struct perf_tool { event_sample sample, - read; + read, + callchain_deferred; event_op mmap, mmap2, comm, --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3ED7C29553A; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; cv=none; b=qt9I+EvUInaHqkHFmirjUlb3g2hJ331579CtGVN6Faf767AYC30MzvNIqvH1Cy9BQWgrhXLq2MfZm6ITr5Ybzq/NbyXDcqSykiix9d0Knd5ZGM0+UUtkvOjCUgq48xMmIWS/XhWjlwlGrnnM+VBD3VhzLaTdPzhehv4OuA6pT3k= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; c=relaxed/simple; bh=VcxOh3mOnIXjHWm2qqnllSh0bi7tu4jE843+0n9nFa4=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=ae1/d+DZcbvbhI3SDrrXSfxg20045AJ1ab7tGB0zZi0vtk6NhZqvIPcfoQ7ZXPOLmWRe/rXd8+AZEb7gDmlmQcjLwU5K46qR/ond1zW//2ml5fJZ+xVSCmE02O7ko9jJ5ApV+1nbaerng8WeBC07qPOqhyQ8qM13gNg2dGqqsPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 09658C4AF19; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcE-00000001dV7-1P96; Wed, 30 Apr 2025 16:01:10 -0400 Message-ID: <20250430200110.182678717@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:58:02 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 16/18] perf record: Enable defer_callchain for user callchains References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Namhyung Kim And add the missing feature detection logic to clear the flag on old kernels. $ perf record -g -vv true ... 
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CALLCHAIN|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    sample_id_all                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
    defer_callchain                  1
  ------------------------------------------------------------
  sys_perf_event_open: pid 162755  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off deferred callchain support

Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
 tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
 tools/perf/util/evsel.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b872236a2413..669e585dedee 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1076,6 +1076,14 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
 		}
 	}
 
+	if (param->record_mode == CALLCHAIN_FP && !attr->exclude_callchain_user) {
+		/*
+		 * Enable deferred callchains optimistically.  It'll be switched
+		 * off later if the kernel doesn't support it.
+ */ + attr->defer_callchain =3D 1; + } + if (function) { pr_info("Disabling user space callchains for function trace event.\n"); attr->exclude_callchain_user =3D 1; @@ -2123,6 +2131,8 @@ static int __evsel__prepare_open(struct evsel *evsel,= struct perf_cpu_map *cpus, =20 static void evsel__disable_missing_features(struct evsel *evsel) { + if (perf_missing_features.defer_callchain) + evsel->core.attr.defer_callchain =3D 0; if (perf_missing_features.inherit_sample_read && evsel->core.attr.inherit= && (evsel->core.attr.sample_type & PERF_SAMPLE_READ)) evsel->core.attr.inherit =3D 0; @@ -2397,6 +2407,15 @@ static bool evsel__detect_missing_features(struct ev= sel *evsel, struct perf_cpu =20 /* Please add new feature detection here. */ =20 + attr.defer_callchain =3D true; + attr.sample_type =3D PERF_SAMPLE_CALLCHAIN; + if (has_attr_feature(&attr, /*flags=3D*/0)) + goto found; + perf_missing_features.defer_callchain =3D true; + pr_debug2("switching off deferred callchain support\n"); + attr.defer_callchain =3D false; + attr.sample_type =3D 0; + attr.inherit =3D true; attr.sample_type =3D PERF_SAMPLE_READ; if (has_attr_feature(&attr, /*flags=3D*/0)) @@ -2508,6 +2527,11 @@ static bool evsel__detect_missing_features(struct ev= sel *evsel, struct perf_cpu errno =3D old_errno; =20 check: + if (evsel->core.attr.defer_callchain && + evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN && + perf_missing_features.defer_callchain) + return true; + if (evsel->core.attr.inherit && (evsel->core.attr.sample_type & PERF_SAMPLE_READ) && perf_missing_features.inherit_sample_read) diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index aae431d63d64..7ded99c774c7 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -211,6 +211,7 @@ struct perf_missing_features { bool branch_counters; bool aux_action; bool inherit_sample_read; + bool defer_callchain; }; =20 extern struct perf_missing_features perf_missing_features; --=20 2.47.2 From nobody Fri Dec 19 07:48:02 
2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74743296168; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; cv=none; b=fXe4wFhIHLuDQhDAFqXjGFn4Zo989AIp1O+OvxHEVssJHMt32zSnvdfWpdyIuFopbyTrQQugUVdfcn6JSXuxHWYv8rVYbGu1CDBDMR8GGiSCa2D4nDY3yXgYvG//5Wj93L+jOZu0MdjyR9AE4WoCTYn4zl4kFDQl0UyMIlajoaw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; c=relaxed/simple; bh=o37E9WkrzZ0AInylfmwBbO2mO7Zp/n1r4ftV65Aub44=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=VlcYp6E0CbhwVl4dFGDfy1Lr5TPPSAhWerm09DPSCM28F257XEO30LZF9HF6/GwdxyIPw6qzx76ivekB7w5o16EA1MNalz3QGfB0+pArWZTdzCFwqvY7CZ2Du06UPTJzlkzU5REBFrSrBgaGueJSqLrSCbqyMaw0J1qWAH6gzGk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4C334C4CEFD; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcE-00000001dVj-2821; Wed, 30 Apr 2025 16:01:10 -0400 Message-ID: <20250430200110.358875813@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:58:03 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , 
Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring
Subject: [PATCH v7 17/18] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
References: <20250430195746.827125963@goodmis.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Namhyung Kim

Handle the deferred callchains in the script output.

  $ perf script
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])

  perf     801 [000]    18.031814: DEFERRED CALLCHAIN
              7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)

Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
 tools/perf/builtin-script.c | 89 +++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 9b16df881af8..176b8f299afc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2483,6 +2483,93 @@ static int process_sample_event(const struct perf_tool *tool,
 	return ret;
 }
 
+static int
process_deferred_sample_event(const struct perf_tool *tool, + union perf_event *event, + struct perf_sample *sample, + struct evsel *evsel, + struct machine *machine) +{ + struct perf_script *scr =3D container_of(tool, struct perf_script, tool); + struct perf_event_attr *attr =3D &evsel->core.attr; + struct evsel_script *es =3D evsel->priv; + unsigned int type =3D output_type(attr->type); + struct addr_location al; + FILE *fp =3D es->fp; + int ret =3D 0; + + if (output[type].fields =3D=3D 0) + return 0; + + /* Set thread to NULL to indicate addr_al and al are not initialized */ + addr_location__init(&al); + + if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num, + sample->time)) { + goto out_put; + } + + if (debug_mode) { + if (sample->time < last_timestamp) { + pr_err("Samples misordered, previous: %" PRIu64 + " this: %" PRIu64 "\n", last_timestamp, + sample->time); + nr_unordered++; + } + last_timestamp =3D sample->time; + goto out_put; + } + + if (filter_cpu(sample)) + goto out_put; + + if (machine__resolve(machine, &al, sample) < 0) { + pr_err("problem processing %d event, skipping it.\n", + event->header.type); + ret =3D -1; + goto out_put; + } + + if (al.filtered) + goto out_put; + + if (!show_event(sample, evsel, al.thread, &al, NULL)) + goto out_put; + + if (evswitch__discard(&scr->evswitch, evsel)) + goto out_put; + + perf_sample__fprintf_start(scr, sample, al.thread, evsel, + PERF_RECORD_CALLCHAIN_DEFERRED, fp); + fprintf(fp, "DEFERRED CALLCHAIN"); + + if (PRINT_FIELD(IP)) { + struct callchain_cursor *cursor =3D NULL; + + if (symbol_conf.use_callchain && sample->callchain) { + cursor =3D get_tls_callchain_cursor(); + if (thread__resolve_callchain(al.thread, cursor, evsel, + sample, NULL, NULL, + scripting_max_stack)) { + pr_info("cannot resolve deferred callchains\n"); + cursor =3D NULL; + } + } + + fputc(cursor ? 
'\n' : ' ', fp); + sample__fprintf_sym(sample, &al, 0, output[type].print_ip_opts, + cursor, symbol_conf.bt_stop_list, fp); + } + + fprintf(fp, "\n"); + + if (verbose > 0) + fflush(fp); + +out_put: + addr_location__exit(&al); + return ret; +} + // Used when scr->per_event_dump is not set static struct evsel_script es_stdout; =20 @@ -4069,6 +4156,7 @@ int cmd_script(int argc, const char **argv) =20 perf_tool__init(&script.tool, !unsorted_dump); script.tool.sample =3D process_sample_event; + script.tool.callchain_deferred =3D process_deferred_sample_event; script.tool.mmap =3D perf_event__process_mmap; script.tool.mmap2 =3D perf_event__process_mmap2; script.tool.comm =3D perf_event__process_comm; @@ -4095,6 +4183,7 @@ int cmd_script(int argc, const char **argv) script.tool.throttle =3D process_throttle_event; script.tool.unthrottle =3D process_throttle_event; script.tool.ordering_requires_timestamps =3D true; + script.tool.merge_deferred_callchains =3D false; session =3D perf_session__new(&data, &script.tool); if (IS_ERR(session)) return PTR_ERR(session); --=20 2.47.2 From nobody Fri Dec 19 07:48:02 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A906A296FA1; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; cv=none; b=lbdQLN//IiNbP8HPQjcKkACNJ5pOXIF6mVQGqcyzszGd++is6d45DRhcaUQC8aebGRjuvKNdKPHRXS1Rr1jcLxj5AFOdJ4oeCjClHiqwFZ7KGGHj8L7nD99RpAq7v+bRypG87XlgY5g+F+AZWFub5aAvzSBr9gT+xoRZHW4pT3w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746043266; c=relaxed/simple; bh=/zRizgZFZlI5K0XKGv1I5paDFnaMYmUlybJB51qAZDg=; 
h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=hikyQvpo0GKP8KE73g1GlckSI12JvbWe4Wq6pTXxE0QTXYqPFb0hE3lK8oX7NKHSVplUAzm2g3qDuUJX7cBfms/v0WNeo6R3qcsB9epYt4hGDTQarrLWB/IAzUVOVr8DkwgUCsqJ39GD4lgO03FeRts1ETXCLKtCK0p87AG4B1g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6DBF9C4CEFB; Wed, 30 Apr 2025 20:01:06 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.98.2) (envelope-from ) id 1uADcE-00000001dWE-2qNu; Wed, 30 Apr 2025 16:01:10 -0400 Message-ID: <20250430200110.528523011@goodmis.org> User-Agent: quilt/0.68 Date: Wed, 30 Apr 2025 15:58:04 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Josh Poimboeuf , x86@kernel.org, Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Indu Bhagat , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org, Mark Brown , linux-toolchains@vger.kernel.org, Jordan Rome , Sam James , Andrii Nakryiko , Jens Remus , Florian Weimer , Andy Lutomirski , Weinan Liu , Blake Jones , Beau Belgrave , "Jose E. Marchesi" , Alexander Aring Subject: [PATCH v7 18/18] perf tools: Merge deferred user callchains References: <20250430195746.827125963@goodmis.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Namhyung Kim Save samples with deferred callchains in a separate list and deliver them after merging the user callchains. If users don't want to merge they can set tool->merge_deferred_callchains to false to prevent the behavior. With previous result, now perf script will show the merged callchains. 
  $ perf script
  perf     801 [000]    18.031793:          1 cycles:P:
          ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
          ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
          ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
          ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
          ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
          ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
          ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
          ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
          ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
          ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
          ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
          ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
          ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
              7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
  ...

The old output can be obtained with the --no-merge-callchains option.
perf report also shows the user callchain entry at the end.
  $ perf report --no-children --percent-limit=0 --stdio -q -S __intel_pmu_enable_all.isra.0
  # symbol: __intel_pmu_enable_all.isra.0
       0.00%  perf     [kernel.kallsyms]
              |
              ---__intel_pmu_enable_all.isra.0
                 perf_ctx_enable
                 event_function
                 remote_function
                 generic_exec_single
                 smp_call_function_single
                 event_function_call
                 perf_event_for_each_child
                 _perf_ioctl
                 perf_ioctl
                 __x64_sys_ioctl
                 do_syscall_64
                 entry_SYSCALL_64
                 __GI___ioctl

Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
Signed-off-by: Steven Rostedt (Google)
---
 tools/perf/Documentation/perf-script.txt |  5 ++
 tools/perf/builtin-script.c              |  5 +-
 tools/perf/util/callchain.c              | 24 +++++++++
 tools/perf/util/callchain.h              |  3 ++
 tools/perf/util/evlist.c                 |  1 +
 tools/perf/util/evlist.h                 |  1 +
 tools/perf/util/session.c                | 63 +++++++++++++++++++++++-
 tools/perf/util/tool.c                   |  1 +
 tools/perf/util/tool.h                   |  1 +
 9 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 28bec7e78bc8..03d112960632 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -527,6 +527,11 @@ include::itrace.txt[]
 	The known limitations include exception handing such as
 	setjmp/longjmp will have calls/returns not match.
 
+--merge-callchains::
+	Enable merging deferred user callchains if available.  This is the
+	default behavior.  If you want to see separate CALLCHAIN_DEFERRED
+	records for some reason, use --no-merge-callchains explicitly.
+ :GMEXAMPLECMD: script :GMEXAMPLESUBCMD: include::guest-files.txt[] diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 176b8f299afc..dd17c11af0c8 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -3775,6 +3775,7 @@ int cmd_script(int argc, const char **argv) bool header_only =3D false; bool script_started =3D false; bool unsorted_dump =3D false; + bool merge_deferred_callchains =3D true; char *rec_script_path =3D NULL; char *rep_script_path =3D NULL; struct perf_session *session; @@ -3928,6 +3929,8 @@ int cmd_script(int argc, const char **argv) "Guest code can be found in hypervisor process"), OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr, "Enable LBR callgraph stitching approach"), + OPT_BOOLEAN('\0', "merge-callchains", &merge_deferred_callchains, + "Enable merge deferred user callchains"), OPTS_EVSWITCH(&script.evswitch), OPT_END() }; @@ -4183,7 +4186,7 @@ int cmd_script(int argc, const char **argv) script.tool.throttle =3D process_throttle_event; script.tool.unthrottle =3D process_throttle_event; script.tool.ordering_requires_timestamps =3D true; - script.tool.merge_deferred_callchains =3D false; + script.tool.merge_deferred_callchains =3D merge_deferred_callchains; session =3D perf_session__new(&data, &script.tool); if (IS_ERR(session)) return PTR_ERR(session); diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c index d7b7eef740b9..6d423d92861b 100644 --- a/tools/perf/util/callchain.c +++ b/tools/perf/util/callchain.c @@ -1828,3 +1828,27 @@ int sample__for_each_callchain_node(struct thread *t= hread, struct evsel *evsel, } return 0; } + +int sample__merge_deferred_callchain(struct perf_sample *sample_orig, + struct perf_sample *sample_callchain) +{ + u64 nr_orig =3D sample_orig->callchain->nr - 1; + u64 nr_deferred =3D sample_callchain->callchain->nr; + struct ip_callchain *callchain; + + callchain =3D calloc(1 + nr_orig + nr_deferred, sizeof(u64)); + if (callchain =3D=3D NULL) { + 
sample_orig->deferred_callchain =3D false; + return -ENOMEM; + } + + callchain->nr =3D nr_orig + nr_deferred; + /* copy except for the last PERF_CONTEXT_USER_DEFERRED */ + memcpy(callchain->ips, sample_orig->callchain->ips, nr_orig * sizeof(u64)= ); + /* copy deferred use callchains */ + memcpy(&callchain->ips[nr_orig], sample_callchain->callchain->ips, + nr_deferred * sizeof(u64)); + + sample_orig->callchain =3D callchain; + return 0; +} diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h index 86ed9e4d04f9..89785125ed25 100644 --- a/tools/perf/util/callchain.h +++ b/tools/perf/util/callchain.h @@ -317,4 +317,7 @@ int sample__for_each_callchain_node(struct thread *thre= ad, struct evsel *evsel, struct perf_sample *sample, int max_stack, bool symbols, callchain_iter_fn cb, void *data); =20 +int sample__merge_deferred_callchain(struct perf_sample *sample_orig, + struct perf_sample *sample_callchain); + #endif /* __PERF_CALLCHAIN_H */ diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index c1a04141aed0..d23a3f8e8649 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -82,6 +82,7 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_= map *cpus, evlist->ctl_fd.ack =3D -1; evlist->ctl_fd.pos =3D -1; evlist->nr_br_cntr =3D -1; + INIT_LIST_HEAD(&evlist->deferred_samples); } =20 struct evlist *evlist__new(void) diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index edcbf1c10e92..a8cb5a29d55e 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -84,6 +84,7 @@ struct evlist { int pos; /* index at evlist core object to check signals */ } ctl_fd; struct event_enable_timer *eet; + struct list_head deferred_samples; }; =20 struct evsel_str_handler { diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 30fb1d281be8..51f17bf42dd9 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1277,6 +1277,56 @@ static int evlist__deliver_sample(struct 
evlist *evl= ist, const struct perf_tool per_thread); } =20 +struct deferred_event { + struct list_head list; + union perf_event *event; +}; + +static int evlist__deliver_deferred_samples(struct evlist *evlist, + const struct perf_tool *tool, + union perf_event *event, + struct perf_sample *sample, + struct machine *machine) +{ + struct deferred_event *de, *tmp; + struct evsel *evsel; + int ret =3D 0; + + if (!tool->merge_deferred_callchains) { + evsel =3D evlist__id2evsel(evlist, sample->id); + return tool->callchain_deferred(tool, event, sample, + evsel, machine); + } + + list_for_each_entry_safe(de, tmp, &evlist->deferred_samples, list) { + struct perf_sample orig_sample; + + ret =3D evlist__parse_sample(evlist, de->event, &orig_sample); + if (ret < 0) { + pr_err("failed to parse original sample\n"); + break; + } + + if (sample->tid !=3D orig_sample.tid) + continue; + + evsel =3D evlist__id2evsel(evlist, orig_sample.id); + sample__merge_deferred_callchain(&orig_sample, sample); + ret =3D evlist__deliver_sample(evlist, tool, de->event, + &orig_sample, evsel, machine); + + if (orig_sample.deferred_callchain) + free(orig_sample.callchain); + + list_del(&de->list); + free(de); + + if (ret) + break; + } + return ret; +} + static int machines__deliver_event(struct machines *machines, struct evlist *evlist, union perf_event *event, @@ -1305,6 +1355,16 @@ static int machines__deliver_event(struct machines *= machines, return 0; } dump_sample(evsel, event, sample, perf_env__arch(machine->env)); + if (sample->deferred_callchain && tool->merge_deferred_callchains) { + struct deferred_event *de =3D malloc(sizeof(*de)); + + if (de =3D=3D NULL) + return -ENOMEM; + + de->event =3D event; + list_add_tail(&de->list, &evlist->deferred_samples); + return 0; + } return evlist__deliver_sample(evlist, tool, event, sample, evsel, machin= e); case PERF_RECORD_MMAP: return tool->mmap(tool, event, sample, machine); @@ -1364,7 +1424,8 @@ static int machines__deliver_event(struct machines 
*m= achines, return tool->aux_output_hw_id(tool, event, sample, machine); case PERF_RECORD_CALLCHAIN_DEFERRED: dump_deferred_callchain(evsel, event, sample); - return tool->callchain_deferred(tool, event, sample, evsel, machine); + return evlist__deliver_deferred_samples(evlist, tool, event, + sample, machine); default: ++evlist->stats.nr_unknown_events; return -1; diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c index e78f16de912e..385043e06627 100644 --- a/tools/perf/util/tool.c +++ b/tools/perf/util/tool.c @@ -238,6 +238,7 @@ void perf_tool__init(struct perf_tool *tool, bool order= ed_events) tool->cgroup_events =3D false; tool->no_warn =3D false; tool->show_feat_hdr =3D SHOW_FEAT_NO_HEADER; + tool->merge_deferred_callchains =3D true; =20 tool->sample =3D process_event_sample_stub; tool->mmap =3D process_event_stub; diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h index 9987bbde6d5e..d06580478ab1 100644 --- a/tools/perf/util/tool.h +++ b/tools/perf/util/tool.h @@ -87,6 +87,7 @@ struct perf_tool { bool cgroup_events; bool no_warn; bool dont_split_sample_group; + bool merge_deferred_callchains; enum show_feature_header show_feat_hdr; }; =20 --=20 2.47.2
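
For anyone who wants to try the merge step from patch 18 outside of the perf tree, here is a minimal standalone sketch of the idea behind sample__merge_deferred_callchain(): drop the trailing deferred marker from the original chain and append the user entries from the deferred record. It is not the perf code. struct demo_callchain, make_chain() and merge_deferred() are hypothetical names invented for this sketch, and DEMO_CONTEXT_USER_DEFERRED is a placeholder value standing in for PERF_CONTEXT_USER_DEFERRED from the uapi header added earlier in the series. Unlike the patch, the sketch checks for the marker itself instead of relying on a flag set by the sample parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder marker value for illustration only (assumption, not taken
 * from this patch); the real constant lives in the uapi header. */
#define DEMO_CONTEXT_USER_DEFERRED ((uint64_t)-640)

/* Local stand-in for perf's struct ip_callchain. */
struct demo_callchain {
	uint64_t nr;
	uint64_t ips[];
};

/* Hypothetical helper: build a heap-allocated chain from an array. */
static struct demo_callchain *make_chain(const uint64_t *ips, uint64_t nr)
{
	/* one u64 for 'nr' plus 'nr' entries */
	struct demo_callchain *c = calloc(1 + nr, sizeof(uint64_t));

	if (c == NULL)
		return NULL;
	c->nr = nr;
	memcpy(c->ips, ips, nr * sizeof(uint64_t));
	return c;
}

/*
 * Hypothetical helper mirroring the merge idea: returns a newly
 * allocated chain holding the original entries minus the trailing
 * deferred marker, followed by the deferred user entries.  Returns
 * NULL if the original chain does not end with the marker or if the
 * allocation fails.  The caller frees the result.
 */
static struct demo_callchain *merge_deferred(const struct demo_callchain *orig,
					     const struct demo_callchain *deferred)
{
	struct demo_callchain *merged;
	uint64_t nr_orig;

	assert(orig != NULL && deferred != NULL);

	if (orig->nr == 0 ||
	    orig->ips[orig->nr - 1] != DEMO_CONTEXT_USER_DEFERRED)
		return NULL;

	nr_orig = orig->nr - 1;
	merged = calloc(1 + nr_orig + deferred->nr, sizeof(uint64_t));
	if (merged == NULL)
		return NULL;

	merged->nr = nr_orig + deferred->nr;
	/* copy everything except the trailing deferred marker */
	memcpy(merged->ips, orig->ips, nr_orig * sizeof(uint64_t));
	/* append the deferred user callchain entries */
	memcpy(&merged->ips[nr_orig], deferred->ips,
	       deferred->nr * sizeof(uint64_t));
	return merged;
}
```

Feeding it a kernel chain that ends with the marker plus a one-entry deferred chain gives a single chain with the user frame at the end, which matches the shape of the merged perf script output shown in the patch 18 changelog. Returning a fresh allocation rather than rewriting the original sample in place is a choice made only to keep the sketch easy to test.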