[PATCH resend 6/8] tracing/ftrace: Add might_fault check to syscall probes

Mathieu Desnoyers posted 8 patches 1 month, 4 weeks ago
There is a newer version of this series
Posted by Mathieu Desnoyers 1 month, 4 weeks ago
Add a might_fault() check to validate that the ftrace sys_enter/sys_exit
probe callbacks are indeed called from a context where page faults can
be handled.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Michael Jeanson <mjeanson@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes <joel@joelfernandes.org>
---
 include/trace/trace_events.h  | 1 +
 kernel/trace/trace_syscalls.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 0228d9ed94a3..e0d4850b0d77 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -446,6 +446,7 @@ __DECLARE_EVENT_CLASS(call, PARAMS(proto), PARAMS(args), PARAMS(tstruct), \
 static notrace void							\
 trace_event_raw_event_##call(void *__data, proto)			\
 {									\
+	might_fault();							\
 	guard(preempt_notrace)();					\
 	do_trace_event_raw_event_##call(__data, args);			\
 }
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a3d8ac00793e..0430890cbb42 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -303,6 +303,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
@@ -348,6 +349,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
 	 * Syscall probe called with preemption enabled, but the ring
 	 * buffer and per-cpu data require preemption to be disabled.
 	 */
+	might_fault();
 	guard(preempt_notrace)();
 
 	syscall_nr = trace_get_syscall_nr(current, regs);
-- 
2.39.2
Re: [PATCH resend 6/8] tracing/ftrace: Add might_fault check to syscall probes
Posted by Thomas Gleixner 1 month ago
On Mon, Sep 30 2024 at 15:23, Mathieu Desnoyers wrote:
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index a3d8ac00793e..0430890cbb42 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -303,6 +303,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
>  	 * Syscall probe called with preemption enabled, but the ring
>  	 * buffer and per-cpu data require preemption to be disabled.
>  	 */
> +	might_fault();
>  	guard(preempt_notrace)();

I find it odd that the might_fault() check is in all the implementations
and not in the tracepoint itself:

    if (syscall) {
        might_fault();
        rcu_read_unlock_trace();
    } else ...

That's where I would have expected it to be.

Thanks,

        tglx
Re: [PATCH resend 6/8] tracing/ftrace: Add might_fault check to syscall probes
Posted by Mathieu Desnoyers 1 month ago
On 2024-10-28 13:42, Thomas Gleixner wrote:
> On Mon, Sep 30 2024 at 15:23, Mathieu Desnoyers wrote:
>> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
>> index a3d8ac00793e..0430890cbb42 100644
>> --- a/kernel/trace/trace_syscalls.c
>> +++ b/kernel/trace/trace_syscalls.c
>> @@ -303,6 +303,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
>>   	 * Syscall probe called with preemption enabled, but the ring
>>   	 * buffer and per-cpu data require preemption to be disabled.
>>   	 */
>> +	might_fault();
>>   	guard(preempt_notrace)();
> 
> I find it odd that the might_fault() check is in all the implementations
> and not in the tracepoint itself:
> 
>      if (syscall) {
>          might_fault();
>          rcu_read_unlock_trace();
>      } else ...
> 
> That's where I would have expected it to be.

You raise a good point: we should also add a might_fault() check in
__DO_TRACE() in the syscall case, so we can catch incorrect use of the
syscall tracepoint even if no probes are registered to it.

I've added the might_fault() in each tracer syscall probe to make sure
a tracer doesn't end up registering a faultable probe on a tracepoint
protected with preempt_disable by mistake. It validates that the tracers
are using the tracepoint registration as expected.
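
The two placements discussed above can be illustrated with a minimal
userspace sketch in C. This is not kernel code and does not reproduce
__DO_TRACE(): model_might_fault(), model_preempt_count, struct
model_tracepoint, model_syscall_probe() and model_do_trace() are all
hypothetical stand-ins, and "preemption disabled" is modelled by a plain
counter. The sketch only shows which check fires in which misuse case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not the kernel's definitions. */
static int model_preempt_count;	/* > 0 models "preemption disabled" */
static int model_violations;	/* number of failed faultable-context checks */

/* Models might_fault(): complain when called with preemption disabled. */
static void model_might_fault(const char *where)
{
	if (model_preempt_count > 0) {
		model_violations++;
		fprintf(stderr, "might_fault() violation in %s\n", where);
	}
}

typedef void (*model_probe_fn)(void);

struct model_tracepoint {
	bool syscall;		/* true: probes are expected to be faultable */
	model_probe_fn probe;	/* NULL when no probe is registered */
};

/* Probe-side check: catches a faultable probe that was registered on a
 * tracepoint which runs its probes with preemption disabled. */
static void model_syscall_probe(void)
{
	model_might_fault("probe");
}

/* Tracepoint-side check: runs in the syscall case even when no probe is
 * registered, so a bad call site is caught regardless of tracer state. */
static void model_do_trace(const struct model_tracepoint *tp)
{
	assert(tp != NULL);
	if (tp->syscall) {
		model_might_fault("tracepoint");
		if (tp->probe)
			tp->probe();
	} else {
		model_preempt_count++;	/* models preempt_disable_notrace() */
		if (tp->probe)
			tp->probe();
		model_preempt_count--;	/* models preempt_enable_notrace() */
	}
}
```

Calling model_do_trace() on a syscall tracepoint with no probe while
model_preempt_count is non-zero trips the tracepoint-side check only.
Calling it on a non-syscall tracepoint that has model_syscall_probe()
attached trips the probe-side check only. The two checks therefore cover
different mistakes, which is the reason for keeping both.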

I'll prepare a separate patch adding this and will add it to this
series.

Thanks,

Mathieu

> 
> Thanks,
> 
>          tglx

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com