From nobody Sat Nov 30 10:40:18 2024 Received: from smtpout.efficios.com (smtpout.efficios.com [167.114.26.122]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDED83CF51; Mon, 9 Sep 2024 20:17:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=167.114.26.122 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725913040; cv=none; b=BtT6LT6j5xtO8OWSh0OyFyeCIjwmltU9xh26L8tzeDe2CYNdr0OVOmPje6R0q8epiDoq7ys5OtpfwPEmKiCvVM6NLq5fcbM3FXbadamtvYRbfVGvhp7uWPUrP8Rd/LtwrtloA/kj0RiINybITa4ogAHPciwUk46c4AlLvDMc2hs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725913040; c=relaxed/simple; bh=Cjsx8yUfjrUv/CsLgxDDfjYAujtl84jvF84EtiDCUyY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=F7PPxphErwMPfwlALS6ov62rEF8NDdSPdEmGPMtzxyL0AcvNuCJatrOvDj1asZ7yEIPC/ipqqw//HFJOElMoNasGJAN3uNg3rWxUee4PbQm0+AFwtTEPoHN6OaQB2Cu8WISxblVkq0RQdSPdPlsOoWIa4SuNIctP4vsYiKmlRj0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=efficios.com; spf=pass smtp.mailfrom=efficios.com; dkim=pass (2048-bit key) header.d=efficios.com header.i=@efficios.com header.b=F/2N5Oxq; arc=none smtp.client-ip=167.114.26.122 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=efficios.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=efficios.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=efficios.com header.i=@efficios.com header.b="F/2N5Oxq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=efficios.com; s=smtpout1; t=1725913035; bh=Cjsx8yUfjrUv/CsLgxDDfjYAujtl84jvF84EtiDCUyY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=F/2N5OxqL+X+0F/LClQyyMIhY5j6tla8rMAv2DiLrVUnYKHsO4WiXIzdRd5sIHcDL 
	 lIzNyxbNbYSC7U4naJipLyLjqVe56OlwRXnpe4Pbx8AFz5Ahu6cEwWCJNAJOnXHqoW
	 RoTDxeoXAgHW958zUa7YNDDP+Iap3otW+T7oTTr1vanghYCGwo3iusVVrQvfikc2ya
	 rbqeIXFQW9VefYw0L5b0RhXvYsX+fSXXLezGRgSEhg7y7pgUJH2PY3heaR6wCnrODo
	 EmvzEFHZ/EybQIfJmf7m8sv/47zUwfCpjbSnghM8fTAlI3+bXableW1QUo811ytBU3
	 AoKf9wf6h98sA==
Received: from thinkos.internal.efficios.com (96-127-217-162.qc.cable.ebox.net [96.127.217.162])
	by smtpout.efficios.com (Postfix) with ESMTPSA id 4X2dRH3qd0z1KSb;
	Mon, 9 Sep 2024 16:17:15 -0400 (EDT)
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
	Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
	linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 1/8] tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL
Date: Mon, 9 Sep 2024 16:16:45 -0400
Message-Id: <20240909201652.319406-2-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In preparation for allowing system call tracepoints to handle page
faults, introduce TRACE_EVENT_SYSCALL to declare the sys_enter/sys_exit
tracepoints.

Emit the static inlines register_trace_syscall_##name for events
declared with TRACE_EVENT_SYSCALL, allowing source-level validation
that only probes meant to handle system call entry/exit events are
registered to them.

Move the common code between __DECLARE_TRACE and
__DECLARE_TRACE_SYSCALL into __DECLARE_TRACE_COMMON.
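
The source-level validation described above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `struct tracepoint`, `tracepoint_probe_register()`, `probe_fn_t`, the `tp_` prefix, the simplified two-argument `DECLARE_TRACE*` macros and the `sched_demo` event are all hypothetical stand-ins. It only shows that a tracepoint declared through the syscall variant gets a `register_trace_syscall_<name>()` helper and no `register_trace_<name>()`, so a caller using the wrong helper fails to compile.

```c
#include <stddef.h>

/* Generic function pointer type used to store probes of any signature. */
typedef void (*probe_fn_t)(void);

/* Hypothetical, heavily simplified stand-in for the kernel's struct tracepoint. */
struct tracepoint {
	const char *name;
	probe_fn_t probe;	/* sketch only: a single probe per tracepoint */
	void *data;
};

/* Hypothetical stand-in for tracepoint_probe_register(). */
static int tracepoint_probe_register(struct tracepoint *tp, probe_fn_t probe,
				     void *data)
{
	if (tp->probe)
		return -1;	/* already registered */
	tp->probe = probe;
	tp->data = data;
	return 0;
}

#define PARAMS(...) __VA_ARGS__

/* Regular tracepoints: generates register_trace_<name>(). */
#define DECLARE_TRACE(name, proto)					\
	static struct tracepoint tp_##name = { #name, NULL, NULL };	\
	static inline int						\
	register_trace_##name(void (*probe)(void *data, proto), void *data) \
	{								\
		return tracepoint_probe_register(&tp_##name,		\
						 (probe_fn_t)probe, data); \
	}

/* Syscall tracepoints: generates register_trace_syscall_<name>() only. */
#define DECLARE_TRACE_SYSCALL(name, proto)				\
	static struct tracepoint tp_##name = { #name, NULL, NULL };	\
	static inline int						\
	register_trace_syscall_##name(void (*probe)(void *data, proto),	\
				      void *data)			\
	{								\
		return tracepoint_probe_register(&tp_##name,		\
						 (probe_fn_t)probe, data); \
	}

DECLARE_TRACE(sched_demo, PARAMS(int cpu))
DECLARE_TRACE_SYSCALL(sys_enter, PARAMS(long regs, long id))

static void on_sched_demo(void *data, int cpu)
{
	(void)data; (void)cpu;
}

static void on_sys_enter(void *data, long regs, long id)
{
	(void)data; (void)regs; (void)id;
}

/* Registers one probe on each tracepoint through the matching helper. */
int register_demo(void)
{
	int ret = register_trace_sched_demo(on_sched_demo, NULL);

	if (ret)
		return ret;
	/*
	 * register_trace_sys_enter(on_sys_enter, NULL) would not compile:
	 * that name is never generated for a syscall tracepoint.
	 */
	return register_trace_syscall_sys_enter(on_sys_enter, NULL);
}
```

In this sketch the distinct helper name is the whole mechanism: converting a call site to `register_trace_syscall_*` is an explicit statement that the probe is meant for system call entry/exit events.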

This change is not meant to alter the generated code, and only prepares
the following modifications.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Tested-by: Andrii Nakryiko # BPF parts
---
 include/linux/tracepoint.h      | 66 +++++++++++++++++++++++++--------
 include/trace/bpf_probe.h       |  3 ++
 include/trace/define_trace.h    |  5 +++
 include/trace/events/syscalls.h |  4 +-
 include/trace/perf.h            |  3 ++
 include/trace/trace_events.h    | 28 ++++++++++++++
 kernel/entry/common.c           |  4 +-
 kernel/trace/trace_syscalls.c   |  8 ++--
 8 files changed, 98 insertions(+), 23 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 689b6d71590e..b2cfe6a9097c 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -249,10 +249,28 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * site if it is not watching, as it will need to be active when the
  * tracepoint is enabled.
  */
-#define __DECLARE_TRACE(name, proto, args, cond, data_proto) \
+#define __DECLARE_TRACE_COMMON(name, proto, args, cond, data_proto) \
 	extern int __traceiter_##name(data_proto); \
 	DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name); \
 	extern struct tracepoint __tracepoint_##name; \
+	static inline int \
+	unregister_trace_##name(void (*probe)(data_proto), void *data) \
+	{ \
+		return tracepoint_probe_unregister(&__tracepoint_##name,\
+						(void *)probe, data); \
+	} \
+	static inline void \
+	check_trace_callback_type_##name(void (*cb)(data_proto)) \
+	{ \
+	} \
+	static inline bool \
+	trace_##name##_enabled(void) \
+	{ \
+		return static_key_false(&__tracepoint_##name.key); \
+	}
+
+#define __DECLARE_TRACE(name, proto, args, cond, data_proto) \
+	__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), cond, PARAMS(data_proto)) \
 	static inline void trace_##name(proto) \
 	{ \
 		if (static_key_false(&__tracepoint_##name.key)) \
@@ -264,8 +282,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 				  "RCU not watching for tracepoint"); \
 		} \
 	} \
-	__DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args), \
-			    PARAMS(cond)) \
+	static inline void trace_##name##_rcuidle(proto) \
+	{ \
+		if (static_key_false(&__tracepoint_##name.key)) \
+			__DO_TRACE(name, \
+				TP_ARGS(args), \
+				TP_CONDITION(cond), 1); \
+	} \
 	static inline int \
 	register_trace_##name(void (*probe)(data_proto), void *data) \
 	{ \
@@ -278,21 +301,26 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 	{ \
 		return tracepoint_probe_register_prio(&__tracepoint_##name, \
 					      (void *)probe, data, prio); \
-	} \
-	static inline int \
-	unregister_trace_##name(void (*probe)(data_proto), void *data) \
-	{ \
-		return tracepoint_probe_unregister(&__tracepoint_##name,\
-						(void *)probe, data); \
-	} \
-	static inline void \
-	check_trace_callback_type_##name(void (*cb)(data_proto)) \
+	}
+
+#define __DECLARE_TRACE_SYSCALL(name, proto, args, cond, data_proto) \
+	__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), cond, PARAMS(data_proto)) \
+	static inline void trace_syscall_##name(proto) \
 	{ \
+		if (static_key_false(&__tracepoint_##name.key)) \
+			__DO_TRACE(name, \
+				TP_ARGS(args), \
+				TP_CONDITION(cond), 0); \
+		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) { \
+			WARN_ONCE(!rcu_is_watching(), \
+				  "RCU not watching for tracepoint"); \
+		} \
 	} \
-	static inline bool \
-	trace_##name##_enabled(void) \
+	static inline int \
+	register_trace_syscall_##name(void (*probe)(data_proto), void *data) \
 	{ \
-		return static_key_false(&__tracepoint_##name.key); \
+		return tracepoint_probe_register(&__tracepoint_##name, \
+						(void *)probe, data); \
 	}
 
 /*
@@ -440,6 +468,11 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 			cpu_online(raw_smp_processor_id()) && (PARAMS(cond)), \
 			PARAMS(void *__data, proto))
 
+#define DECLARE_TRACE_SYSCALL(name, proto, args) \
+	__DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args), \
+			cpu_online(raw_smp_processor_id()), \
+			PARAMS(void *__data, proto))
+
 #define TRACE_EVENT_FLAGS(event, flag)
 
 #define TRACE_EVENT_PERF_PERM(event, expr...)
@@ -577,6 +610,9 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 			      struct, assign, print) \
 	DECLARE_TRACE_CONDITION(name, PARAMS(proto), \
 				PARAMS(args), PARAMS(cond))
+#define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign, \
+			    print, reg, unreg) \
+	DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args))
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index a2ea11cc912e..c85bbce5aaa5 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -53,6 +53,9 @@ __bpf_trace_##call(void *__data, proto) \
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 	__BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args))
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 /*
  * This part is compiled out, it is only here as a build time check
  * to make sure that if the tracepoint handling changes, the
diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index 00723935dcc7..ff5fa17a6259 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -46,6 +46,10 @@
 		assign, print, reg, unreg) \
 	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
 
+#undef TRACE_EVENT_SYSCALL
+#define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign, print, reg, unreg) \
+	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
+
 #undef TRACE_EVENT_NOP
 #define TRACE_EVENT_NOP(name, proto, args, struct, assign, print)
 
@@ -107,6 +111,7 @@
 #undef TRACE_EVENT
 #undef TRACE_EVENT_FN
 #undef TRACE_EVENT_FN_COND
+#undef TRACE_EVENT_SYSCALL
 #undef TRACE_EVENT_CONDITION
 #undef TRACE_EVENT_NOP
 #undef DEFINE_EVENT_NOP
diff --git a/include/trace/events/syscalls.h b/include/trace/events/syscalls.h
index b6e0cbc2c71f..f31ff446b468 100644
--- a/include/trace/events/syscalls.h
+++ b/include/trace/events/syscalls.h
@@ -15,7 +15,7 @@
 
 #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
 
-TRACE_EVENT_FN(sys_enter,
+TRACE_EVENT_SYSCALL(sys_enter,
 
 	TP_PROTO(struct pt_regs *regs, long id),
 
@@ -41,7 +41,7 @@ TRACE_EVENT_FN(sys_enter,
 
 TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY)
 
-TRACE_EVENT_FN(sys_exit,
+TRACE_EVENT_SYSCALL(sys_exit,
 
 	TP_PROTO(struct pt_regs *regs, long ret),
 
diff --git a/include/trace/perf.h b/include/trace/perf.h
index 2c11181c82e0..ded997af481e 100644
--- a/include/trace/perf.h
+++ b/include/trace/perf.h
@@ -55,6 +55,9 @@ perf_trace_##call(void *__data, proto) \
 				  head, __task); \
 }
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 /*
  * This part is compiled out, it is only here as a build time check
  * to make sure that if the tracepoint handling changes, the
diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index c2f9cabf154d..8bcbb9ee44de 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -45,6 +45,16 @@
 			     PARAMS(print)); \
 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
 
+#undef TRACE_EVENT_SYSCALL
+#define TRACE_EVENT_SYSCALL(name, proto, args, tstruct, assign, print, reg, unreg) \
+	DECLARE_EVENT_SYSCALL_CLASS(name, \
+			     PARAMS(proto), \
+			     PARAMS(args), \
+			     PARAMS(tstruct), \
+			     PARAMS(assign), \
+			     PARAMS(print)); \
+	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
+
 #include "stages/stage1_struct_define.h"
 
 #undef DECLARE_EVENT_CLASS
@@ -57,6 +67,9 @@
 	\
 	static struct trace_event_class event_class_##name;
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args) \
 	static struct trace_event_call __used \
@@ -117,6 +130,9 @@
 		tstruct; \
 	};
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args)
 
@@ -208,6 +224,9 @@ static struct trace_event_functions trace_event_type_funcs_##call = { \
 	.trace = trace_raw_output_##call, \
 };
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, call, proto, args, print) \
 static notrace enum print_line_t \
@@ -265,6 +284,9 @@ static inline notrace int trace_event_get_offsets_##call( \
 	return __data_size; \
 }
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 /*
@@ -409,6 +431,9 @@ trace_event_raw_event_##call(void *__data, proto) \
  * fail to compile unless it too is updated.
  */
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args) \
 static inline void ftrace_test_probe_##call(void) \
@@ -434,6 +459,9 @@ static struct trace_event_class __used __refdata event_class_##call = { \
 	_TRACE_PERF_INIT(call) \
 };
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args) \
 	\
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 90843cc38588..d08472421d0e 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -58,7 +58,7 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
 	syscall = syscall_get_nr(current, regs);
 
 	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
-		trace_sys_enter(regs, syscall);
+		trace_syscall_sys_enter(regs, syscall);
 		/*
 		 * Probes or BPF hooks in the tracepoint may have changed the
 		 * system call number as well.
@@ -166,7 +166,7 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
 	audit_syscall_exit(regs);
 
 	if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
-		trace_sys_exit(regs, syscall_get_return_value(current, regs));
+		trace_syscall_sys_exit(regs, syscall_get_return_value(current, regs));
 
 	step = report_single_step(work);
 	if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 9c581d6da843..067f8e2b930f 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -377,7 +377,7 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!tr->sys_refcount_enter)
-		ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
+		ret = register_trace_syscall_sys_enter(ftrace_syscall_enter, tr);
 	if (!ret) {
 		rcu_assign_pointer(tr->enter_syscall_files[num], file);
 		tr->sys_refcount_enter++;
@@ -415,7 +415,7 @@ static int reg_event_syscall_exit(struct trace_event_file *file,
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!tr->sys_refcount_exit)
-		ret = register_trace_sys_exit(ftrace_syscall_exit, tr);
+		ret = register_trace_syscall_sys_exit(ftrace_syscall_exit, tr);
 	if (!ret) {
 		rcu_assign_pointer(tr->exit_syscall_files[num], file);
 		tr->sys_refcount_exit++;
@@ -631,7 +631,7 @@ static int perf_sysenter_enable(struct trace_event_call *call)
 
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_perf_refcount_enter)
-		ret = register_trace_sys_enter(perf_syscall_enter, NULL);
+		ret = register_trace_syscall_sys_enter(perf_syscall_enter, NULL);
 	if (ret) {
 		pr_info("event trace: Could not activate syscall entry trace point");
 	} else {
@@ -728,7 +728,7 @@ static int perf_sysexit_enable(struct trace_event_call *call)
 
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_perf_refcount_exit)
-		ret = register_trace_sys_exit(perf_syscall_exit, NULL);
+		ret = register_trace_syscall_sys_exit(perf_syscall_exit, NULL);
 	if (ret) {
 		pr_info("event trace: Could not activate syscall exit trace point");
 	} else {
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
	Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
	linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 2/8] tracing/ftrace: guard syscall probe with preempt_notrace
Date: Mon, 9 Sep 2024 16:16:46 -0400
Message-Id: <20240909201652.319406-3-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that ftrace can handle this change by
explicitly disabling preemption within the ftrace system call tracepoint
probes to respect the current expectations within ftrace ring buffer
code.

This change does not yet allow ftrace to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.
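
The scope-based behaviour of `guard(preempt_notrace)()` can be sketched in userspace. The kernel's `guard()` is built on the compiler's `cleanup` variable attribute; the example below reuses that attribute (a GNU C extension supported by GCC and Clang), but everything else is a hypothetical stand-in: `preempt_count` is a plain global rather than per-CPU state, and `struct preempt_guard`, `guard_preempt_notrace()`, `demo_probe()` and `observed_count` are illustrative names. It shows the property the probes rely on: preemption is re-enabled on every exit path, including the early return on an out-of-range system call number.

```c
/* Hypothetical userspace stand-in for the preemption counter. */
static int preempt_count;

static inline void preempt_disable_notrace(void) { preempt_count++; }
static inline void preempt_enable_notrace(void)  { preempt_count--; }

/* Token type whose lifetime delimits the preempt-disabled region. */
struct preempt_guard { int unused; };

static inline struct preempt_guard preempt_guard_enter(void)
{
	struct preempt_guard g = { 0 };

	preempt_disable_notrace();
	return g;
}

/* Invoked automatically when the guard variable goes out of scope. */
static inline void preempt_guard_exit(struct preempt_guard *g)
{
	(void)g;
	preempt_enable_notrace();
}

/* Illustrative equivalent of guard(preempt_notrace)(). */
#define guard_preempt_notrace()						\
	struct preempt_guard guard_var					\
		__attribute__((cleanup(preempt_guard_exit), unused)) =	\
		preempt_guard_enter()

/* Records the counter value seen while the probe body runs. */
static int observed_count = -1;

/* Shaped like the syscall probes: guard first, then early-return checks. */
void demo_probe(int syscall_nr)
{
	guard_preempt_notrace();

	if (syscall_nr < 0)
		return;		/* cleanup handler still runs here */

	observed_count = preempt_count;
}
```

Placing the guard on the first line of the probe, as the patch does, keeps the disable/enable pairing correct without touching any of the existing return statements.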

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Acked-by: Masami Hiramatsu (Google)
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/trace_events.h  | 38 ++++++++++++++++++++++++++++-------
 kernel/trace/trace_syscalls.c | 12 +++++++++++
 2 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 8bcbb9ee44de..0228d9ed94a3 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -263,6 +263,9 @@ static struct trace_event_fields trace_event_fields_##call[] = { \
 	tstruct \
 	{} };
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)
 
@@ -396,11 +399,11 @@ static inline notrace int trace_event_get_offsets_##call( \
 
 #include "stages/stage6_event_callback.h"
 
-#undef DECLARE_EVENT_CLASS
-#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
-	\
+
+#undef __DECLARE_EVENT_CLASS
+#define __DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace void \
-trace_event_raw_event_##call(void *__data, proto) \
+do_trace_event_raw_event_##call(void *__data, proto) \
 { \
 	struct trace_event_file *trace_file = __data; \
 	struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\
@@ -425,15 +428,34 @@ trace_event_raw_event_##call(void *__data, proto) \
 	\
 	trace_event_buffer_commit(&fbuffer); \
 }
+
+#undef DECLARE_EVENT_CLASS
+#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+__DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
+		      PARAMS(assign), PARAMS(print)) \
+static notrace void \
+trace_event_raw_event_##call(void *__data, proto) \
+{ \
+	do_trace_event_raw_event_##call(__data, args); \
+}
+
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS(call, proto, args, tstruct, assign, print) \
+__DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
+		      PARAMS(assign), PARAMS(print)) \
+static notrace void \
+trace_event_raw_event_##call(void *__data, proto) \
+{ \
+	guard(preempt_notrace)(); \
+	do_trace_event_raw_event_##call(__data, args); \
+}
+
 /*
  * The ftrace_test_probe is compiled out, it is only here as a build time check
  * to make sure that if the tracepoint handling changes, the ftrace probe will
  * fail to compile unless it too is updated.
  */
 
-#undef DECLARE_EVENT_SYSCALL_CLASS
-#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
-
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args) \
 static inline void ftrace_test_probe_##call(void) \
@@ -443,6 +465,8 @@ static inline void ftrace_test_probe_##call(void) \
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
+#undef __DECLARE_EVENT_CLASS
+
 #include "stages/stage7_class_define.h"
 
 #undef DECLARE_EVENT_CLASS
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 067f8e2b930f..abf0e0b7cd0b 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -299,6 +299,12 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 	int syscall_nr;
 	int size;
 
+	/*
+	 * Syscall probe called with preemption enabled, but the ring
+	 * buffer and per-cpu data require preemption to be disabled.
+	 */
+	guard(preempt_notrace)();
+
 	syscall_nr = trace_get_syscall_nr(current, regs);
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
@@ -338,6 +344,12 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
 	struct trace_event_buffer fbuffer;
 	int syscall_nr;
 
+	/*
+	 * Syscall probe called with preemption enabled, but the ring
+	 * buffer and per-cpu data require preemption to be disabled.
+	 */
+	guard(preempt_notrace)();
+
 	syscall_nr = trace_get_syscall_nr(current, regs);
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
	Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
	linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 3/8] tracing/perf: guard syscall probe with preempt_notrace
Date: Mon, 9 Sep 2024 16:16:47 -0400
Message-Id: <20240909201652.319406-4-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that perf can handle this change by
explicitly disabling preemption within the perf system call tracepoint
probes to respect the current expectations within perf ring buffer code.

This change does not yet allow perf to take page faults per se within
its probe, but allows its existing probes to adapt to the upcoming
change.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/perf.h          | 41 +++++++++++++++++++++++++++++++----
 kernel/trace/trace_syscalls.c | 12 ++++++++++
 2 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/include/trace/perf.h b/include/trace/perf.h
index ded997af481e..5650c1bad088 100644
--- a/include/trace/perf.h
+++ b/include/trace/perf.h
@@ -12,10 +12,10 @@
 #undef __perf_task
 #define __perf_task(t)	(__task = (t))
 
-#undef DECLARE_EVENT_CLASS
-#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+#undef __DECLARE_EVENT_CLASS
+#define __DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace void \
-perf_trace_##call(void *__data, proto) \
+do_perf_trace_##call(void *__data, proto) \
 { \
 	struct trace_event_call *event_call = __data; \
 	struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\
@@ -55,8 +55,38 @@ perf_trace_##call(void *__data, proto) \
 				  head, __task); \
 }
 
+/*
+ * Define unused __count and __task variables to use @args to pass
+ * arguments to do_perf_trace_##call. This is needed because the
+ * macros __perf_count and __perf_task introduce the side-effect to
+ * store copies into those local variables.
+ */
+#undef DECLARE_EVENT_CLASS
+#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
+__DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
+		      PARAMS(assign), PARAMS(print)) \
+static notrace void \
+perf_trace_##call(void *__data, proto) \
+{ \
+	u64 __count __attribute__((unused)); \
+	struct task_struct *__task __attribute__((unused)); \
+	\
+	do_perf_trace_##call(__data, args); \
+}
+
 #undef DECLARE_EVENT_SYSCALL_CLASS
-#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS(call, proto, args, tstruct, assign, print) \
+__DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
+		      PARAMS(assign), PARAMS(print)) \
+static notrace void \
+perf_trace_##call(void *__data, proto) \
+{ \
+	u64 __count __attribute__((unused)); \
+	struct task_struct *__task __attribute__((unused)); \
+	\
+	guard(preempt_notrace)(); \
+	do_perf_trace_##call(__data, args); \
+}
 
 /*
  * This part is compiled out, it is only here as a build time check
@@ -76,4 +106,7 @@ static inline void perf_test_probe_##call(void) \
 	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
 
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+#undef __DECLARE_EVENT_CLASS
+
 #endif /* CONFIG_PERF_EVENTS */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index abf0e0b7cd0b..a3d8ac00793e 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -594,6 +594,12 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
 	int rctx;
 	int size;
 
+	/*
+	 * Syscall probe called with preemption enabled, but the ring
+	 * buffer and per-cpu data require preemption to be disabled.
+	 */
+	guard(preempt_notrace)();
+
 	syscall_nr = trace_get_syscall_nr(current, regs);
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
@@ -694,6 +700,12 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
 	int rctx;
 	int size;
 
+	/*
+	 * Syscall probe called with preemption enabled, but the ring
+	 * buffer and per-cpu data require preemption to be disabled.
+	 */
+	guard(preempt_notrace)();
+
 	syscall_nr = trace_get_syscall_nr(current, regs);
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
 Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
 Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
 linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 4/8] tracing/bpf: guard syscall probe with preempt_notrace
Date: Mon, 9 Sep 2024 16:16:48 -0400
Message-Id: <20240909201652.319406-5-mathieu.desnoyers@efficios.com>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

In preparation for allowing system call enter/exit instrumentation to
handle page faults, make sure that bpf can handle this change by
explicitly disabling preemption within the bpf system call tracepoint
probes to respect the current expectations within bpf tracing code.

This change does not yet allow bpf to take page faults per se within its
probe, but allows its existing probes to adapt to the upcoming change.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Acked-by: Andrii Nakryiko
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/bpf_probe.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index c85bbce5aaa5..211b98d45fc6 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -53,8 +53,17 @@ __bpf_trace_##call(void *__data, proto)	\
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 	__BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args))
 
+#define __BPF_DECLARE_TRACE_SYSCALL(call, proto, args)			\
+static notrace void							\
+__bpf_trace_##call(void *__data, proto)					\
+{									\
+	guard(preempt_notrace)();					\
+	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args));	\
+}
+
 #undef DECLARE_EVENT_SYSCALL_CLASS
-#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS(call, proto, args, tstruct, assign, print)	\
+	__BPF_DECLARE_TRACE_SYSCALL(call, PARAMS(proto), PARAMS(args))
 
 /*
  * This part is compiled out, it is only here as a build time check
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
 Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
 Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
 linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 5/8] tracing: Allow system call tracepoints to handle page faults
Date: Mon, 9 Sep 2024 16:16:49 -0400
Message-Id: <20240909201652.319406-6-mathieu.desnoyers@efficios.com>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

Use Tasks Trace RCU to protect iteration of system call enter/exit
tracepoint probes to allow those probes to handle page faults.

In preparation for this change, all tracers registering to system call
enter/exit tracepoints should expect those to be called with preemption
enabled.

This allows tracers to fault-in userspace system call arguments such as
path strings within their probe callbacks.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Tested-by: Andrii Nakryiko # BPF parts
---
 include/linux/tracepoint.h | 25 +++++++++++++++++--------
 init/Kconfig               |  1 +
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index b2cfe6a9097c..005ede3b1afa 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -90,6 +91,7 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
 #ifdef CONFIG_TRACEPOINTS
 static inline void tracepoint_synchronize_unregister(void)
 {
+	synchronize_rcu_tasks_trace();
 	synchronize_srcu(&tracepoint_srcu);
 	synchronize_rcu();
 }
@@ -192,7 +194,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * it_func[0] is never NULL because there is at least one element in the array
  * when the array itself is non NULL.
  */
-#define __DO_TRACE(name, args, cond, rcuidle)				\
+#define __DO_TRACE(name, args, cond, rcuidle, syscall)			\
 	do {								\
 		int __maybe_unused __idx = 0;				\
 									\
@@ -203,8 +205,12 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 				"Bad RCU usage for tracepoint"))	\
 			return;						\
 									\
-		/* keep srcu and sched-rcu usage consistent */		\
-		preempt_disable_notrace();				\
+		if (syscall) {						\
+			rcu_read_lock_trace();				\
+		} else {						\
+			/* keep srcu and sched-rcu usage consistent */	\
+			preempt_disable_notrace();			\
+		}							\
 									\
 		/*							\
 		 * For rcuidle callers, use srcu since sched-rcu	\
@@ -222,7 +228,10 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 			srcu_read_unlock_notrace(&tracepoint_srcu, __idx);\
 		}							\
 									\
-		preempt_enable_notrace();				\
+		if (syscall)						\
+			rcu_read_unlock_trace();			\
+		else							\
+			preempt_enable_notrace();			\
 	} while (0)
 
 #ifndef MODULE
@@ -232,7 +241,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(name,				\
 				TP_ARGS(args),				\
-				TP_CONDITION(cond), 1);			\
+				TP_CONDITION(cond), 1, 0);		\
 	}
 #else
 #define __DECLARE_TRACE_RCU(name, proto, args, cond)
@@ -276,7 +285,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(name,				\
 				TP_ARGS(args),				\
-				TP_CONDITION(cond), 0);			\
+				TP_CONDITION(cond), 0, 0);		\
 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
 			WARN_ONCE(!rcu_is_watching(),			\
 				  "RCU not watching for tracepoint");	\
@@ -287,7 +296,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(name,				\
 				TP_ARGS(args),				\
-				TP_CONDITION(cond), 1);			\
+				TP_CONDITION(cond), 1, 0);		\
 	}								\
 	static inline int						\
 	register_trace_##name(void (*probe)(data_proto), void *data)	\
@@ -310,7 +319,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(name,				\
 				TP_ARGS(args),				\
-				TP_CONDITION(cond), 0);			\
+				TP_CONDITION(cond), 0, 1);		\
 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
 			WARN_ONCE(!rcu_is_watching(),			\
 				  "RCU not watching for tracepoint");	\
diff --git a/init/Kconfig b/init/Kconfig
index d8a971b804d3..c854b2887e7f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1937,6 +1937,7 @@ config BINDGEN_VERSION_TEXT
 #
 config TRACEPOINTS
 	bool
+	select TASKS_TRACE_RCU
 
 source "kernel/Kconfig.kexec"
 
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
 Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
 Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
 linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 6/8] tracing/ftrace: Add might_fault check to syscall probes
Date: Mon, 9 Sep 2024 16:16:50 -0400
Message-Id: <20240909201652.319406-7-mathieu.desnoyers@efficios.com>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

Add a might_fault() check to validate that the ftrace sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/trace_events.h  | 1 +
 kernel/trace/trace_syscalls.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 0228d9ed94a3..e0d4850b0d77 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -446,6 +446,7 @@ __DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
 static notrace void							\
 trace_event_raw_event_##call(void *__data, proto)			\
 {									\
+	might_fault();							\
 	guard(preempt_notrace)();					\
 	do_trace_event_raw_event_##call(__data, args);			\
 }
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a3d8ac00793e..0430890cbb42 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -303,6 +303,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
@@ -348,6 +349,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
 Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
 Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
 linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 7/8] tracing/perf: Add might_fault check to syscall probes
Date: Mon, 9 Sep 2024 16:16:51 -0400
Message-Id: <20240909201652.319406-8-mathieu.desnoyers@efficios.com>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

Add a might_fault() check to validate that the perf sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/perf.h          | 1 +
 kernel/trace/trace_syscalls.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/trace/perf.h b/include/trace/perf.h
index 5650c1bad088..321bfd7919f6 100644
--- a/include/trace/perf.h
+++ b/include/trace/perf.h
@@ -84,6 +84,7 @@ perf_trace_##call(void *__data, proto)	\
 	u64 __count __attribute__((unused));				\
 	struct task_struct *__task __attribute__((unused));		\
 									\
+	might_fault();							\
 	guard(preempt_notrace)();					\
 	do_perf_trace_##call(__data, args);				\
 }
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 0430890cbb42..53faa791c735 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -600,6 +600,7 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
@@ -706,6 +707,7 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
-- 
2.39.2

From nobody Sat Nov 30 10:40:18 2024
From: Mathieu Desnoyers
To: Steven Rostedt, Masami Hiramatsu
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Peter Zijlstra,
 Alexei Starovoitov, Yonghong Song, "Paul E . McKenney", Ingo Molnar,
 Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
 Namhyung Kim, Andrii Nakryiko, bpf@vger.kernel.org, Joel Fernandes,
 linux-trace-kernel@vger.kernel.org, Michael Jeanson
Subject: [PATCH 8/8] tracing/bpf: Add might_fault check to syscall probes
Date: Mon, 9 Sep 2024 16:16:52 -0400
Message-Id: <20240909201652.319406-9-mathieu.desnoyers@efficios.com>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>
References: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

Add a might_fault() check to validate that the bpf sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Signed-off-by: Mathieu Desnoyers
Cc: Michael Jeanson
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Alexei Starovoitov
Cc: Yonghong Song
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Andrii Nakryiko
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes
Acked-by: Andrii Nakryiko
Tested-by: Andrii Nakryiko # BPF parts
---
 include/trace/bpf_probe.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index 211b98d45fc6..099df5c3e38a 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -57,6 +57,7 @@ __bpf_trace_##call(void *__data, proto)	\
 static notrace void							\
 __bpf_trace_##call(void *__data, proto)					\
 {									\
+	might_fault();							\
 	guard(preempt_notrace)();					\
 	CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args));	\
 }
-- 
2.39.2
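
Reader's note on the probe pattern used by patches 3 to 8 (check the
calling context first, then disable preemption for the rest of the
scope): the following is only a userspace sketch, assuming GCC or Clang,
whose cleanup variable attribute is what the kernel's guard() helper is
built on. Every name in it (preempt_depth, sketch_preempt_disable(),
sketch_preempt_enable(), SKETCH_GUARD_PREEMPT(), sketch_probe()) is
hypothetical and is not a kernel API. The assert() merely stands in for
the role might_fault() plays in the probes.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-task preemption count. */
static int preempt_depth;

static void sketch_preempt_disable(void)
{
	preempt_depth++;
}

/* Cleanup handlers receive a pointer to the guarded variable. */
static void sketch_preempt_enable(int *unused)
{
	(void)unused;
	preempt_depth--;
}

/*
 * Hypothetical scope guard: "disable" now, "enable" automatically when
 * the enclosing scope is left, on every return path.
 */
#define SKETCH_GUARD_PREEMPT()						\
	int sketch_guard_var						\
		__attribute__((cleanup(sketch_preempt_enable), unused)) = \
		(sketch_preempt_disable(), 0)

/*
 * Mimics the shape of the syscall probes in the series. Returns -1 for
 * an out-of-range syscall number, otherwise the depth observed while
 * the guard is held.
 */
static int sketch_probe(int syscall_nr, int nr_syscalls)
{
	/* Context check comes before the guard, as in the patches. */
	assert(preempt_depth == 0);
	SKETCH_GUARD_PREEMPT();

	if (syscall_nr < 0 || syscall_nr >= nr_syscalls)
		return -1;	/* early return still runs the cleanup */

	return preempt_depth;	/* evaluated while the guard is held */
}
```

The point of the ordering is that the context check has to run while the
caller's state is still visible; once the guard is taken, the probe body
always sees preemption disabled, and every return path restores the
previous state without explicit unlock calls.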