[GIT PULL] tracing: Fixes for 7.0

Steven Rostedt posted 1 patch 1 month, 1 week ago

Linus,

Tracing fixes for 7.0:

- Fix possible dereference of uninitialized pointer

  When validating the persistent ring buffer on boot up, if the first
  validation fails, the error path dereferences "head_page" before that
  variable has been initialized. Move the initialization above the
  first validation check.
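
  The shape of the bug and of the fix can be sketched with a toy model
  (illustrative C only, not the kernel code; all names are simplified):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: the error path taken when the *first* validation fails
 * dereferences head_page, so head_page must be assigned before that
 * check runs, not after it. */
struct page { int id; };
struct cpu_buffer { struct page reader; struct page head; };

static int validate_page(struct page *p)
{
	return p->id >= 0 ? 0 : -1;
}

static struct page *validate_buffer(struct cpu_buffer *b)
{
	struct page *head_page;

	head_page = &b->head;	/* moved above the first check */

	if (validate_page(&b->reader) < 0)
		goto invalid;

	return head_page;
 invalid:
	head_page->id = 0;	/* safe only because of the early assignment */
	return NULL;
}
```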

- Fix use of event length in validation of persistent ring buffer

  On boot up, the persistent ring buffer is checked for validity by
  several methods. One of them is to walk all the events in the memory
  region and make sure each is valid. The length of an event is used to
  advance to the next one, and that length is determined by data in the
  buffer itself. If the length is corrupted, the next event to check
  could end up at a bad memory location.

  Validate the length field of the event when doing the event walk.
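
  The validated walk can be sketched as follows (a simplified model,
  not the kernel code; the one-byte length header is an assumption of
  the sketch):

```c
#include <stdint.h>

/* Simplified model of the event walk: each event begins with a
 * one-byte length covering the whole event, and that length is the
 * stride to the next event. Since it comes from the buffer itself,
 * it is validated before use so a corrupted value can neither escape
 * the buffer nor stall the walk. */
static int walk_events(const uint8_t *data, int tail)
{
	int events = 0;
	int len;

	for (int e = 0; e < tail; e += len) {
		len = data[e];
		if (len <= 0 || len > tail - e)
			return -1;	/* corrupted length: reject buffer */
		events++;
	}
	return events;
}
```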

- Fix function graph on archs that do not support use of ftrace_ops

  When an architecture defines HAVE_DYNAMIC_FTRACE_WITH_ARGS, it means
  that its function graph tracer uses the ftrace_ops of the function
  tracer to call its callbacks. This allows a single registered callback
  to be called directly, instead of checking the hash entries in the
  callback's meta data against the function being traced.

  For architectures that do not support this feature, the loop function
  that tests each registered callback must always be called (even if
  there is only one). The loop function checks each callback's meta
  data against its hash of functions and invokes the callback only if
  the function being traced is in its hash map.

  The issue was that this condition was never checked, and the direct
  call was made even when the architecture did not support it. This
  meant that if function tracing was enabled at the same time as a
  callback was registered with the function graph tracer, that callback
  would be called for every function the function tracer traced, even
  if its meta data restricted it to a small subset of functions.

  Prevent the direct calling for those architectures that do not support
  it.
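
  The shape of the fix can be sketched like this (hypothetical,
  simplified names; FGRAPH_NO_DIRECT is the only identifier taken from
  the patch):

```c
#include <assert.h>

/* A compile-time constant gates the direct-call fast path, so on
 * architectures that cannot use the function tracer's ftrace_ops the
 * hash-testing loop is always taken, even when only one callback is
 * registered. */
#define FGRAPH_NO_DIRECT 1	/* arch did not define ftrace_graph_func */

static int direct_calls;
static int loop_calls;

static void call_direct(void) { direct_calls++; }
static void call_loop(void)   { loop_calls++; }

static void trace_function(int branch_says_direct)
{
	/* The added constant check wins even if the static branch
	 * (modeled here by branch_says_direct) is enabled. */
	if (!FGRAPH_NO_DIRECT && branch_says_direct)
		call_direct();
	else
		call_loop();
}
```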

- Fix references to trace_event_file for hist files

  The hist files used event_file_data() to get a reference to the
  trace_event_file the histogram was attached to. That helper returns
  a pointer even if the trace_event_file is about to be freed (via
  RCU). They should instead use the event_file_file() helper, which
  returns NULL when the trace_event_file is marked to be freed, so
  that no new references are added to it.
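
  The difference between the two helpers can be modeled as follows
  (simplified sketch, not the kernel implementations; only the names
  event_file_data(), event_file_file(), and EVENT_FILE_FL_FREED come
  from the kernel):

```c
#include <assert.h>
#include <stddef.h>

/* The _data() variant hands back the pointer unconditionally, while
 * the _file() variant refuses a file already marked for freeing, so
 * callers cannot take a new reference to a dying trace_event_file. */
#define EVENT_FILE_FL_FREED	(1 << 0)

struct trace_event_file { unsigned int flags; };

static struct trace_event_file *
event_file_data_model(struct trace_event_file *file)
{
	return file;			/* may be about to be freed */
}

static struct trace_event_file *
event_file_file_model(struct trace_event_file *file)
{
	if (file->flags & EVENT_FILE_FL_FREED)
		return NULL;		/* no new references allowed */
	return file;
}
```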

- Wake up hist poll readers when an event is being freed

  When polling on a hist file, the task is woken only when a hist
  trigger fires. This means that if an event is being freed while a
  task is waiting on its hist file, the freeing has to wait until a
  hist trigger occurs to wake that task. Note that the event will not
  be completely freed until all references are dropped, and a hist
  poller holds a reference, but the waiter should still be woken when
  the event is being freed.
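
  A toy model of the ordering (not the kernel code; hist_poll_wakeup()
  and EVENT_FILE_FL_FREED are the only names taken from the patch):

```c
#include <assert.h>

/* Freeing the event file first sets the FREED flag and then wakes the
 * hist poll waiters, so a sleeping poller re-checks the flag and drops
 * its reference instead of sleeping until the next hist trigger. */
#define EVENT_FILE_FL_FREED	(1 << 0)

struct trace_event_file { unsigned int flags; };

static int wakeups;

static void hist_poll_wakeup(void)
{
	wakeups++;	/* stands in for waking the hist poll wait queue */
}

static void remove_event_file(struct trace_event_file *file)
{
	file->flags |= EVENT_FILE_FL_FREED;
	/* Wake pollers so they notice EVENT_FILE_FL_FREED. */
	hist_poll_wakeup();
}
```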


Please pull the latest trace-v7.0-2 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
trace-v7.0-2

Tag SHA1: d356215e804503d1ba58c3dfbeda19d549635c29
Head SHA1: 9678e53179aa7e907360f5b5b275769008a69b80


Daniil Dulov (1):
      ring-buffer: Fix possible dereference of uninitialized pointer

Masami Hiramatsu (Google) (1):
      tracing: ring-buffer: Fix to check event length before using

Petr Pavlu (2):
      tracing: Fix checking of freed trace_event_file for hist files
      tracing: Wake up poll waiters for hist files when removing an event

Steven Rostedt (1):
      fgraph: Do not call handlers direct when not using ftrace_ops

----
 include/linux/ftrace.h           | 13 ++++++++++---
 include/linux/trace_events.h     |  5 +++++
 kernel/trace/fgraph.c            | 12 +++++++++++-
 kernel/trace/ring_buffer.c       |  9 +++++++--
 kernel/trace/trace_events.c      |  3 +++
 kernel/trace/trace_events_hist.c |  4 ++--
 6 files changed, 38 insertions(+), 8 deletions(-)
---------------------------
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1a4d36fc9085..c242fe49af4c 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1092,10 +1092,17 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 #ifndef ftrace_graph_func
-#define ftrace_graph_func ftrace_stub
-#define FTRACE_OPS_GRAPH_STUB FTRACE_OPS_FL_STUB
+# define ftrace_graph_func ftrace_stub
+# define FTRACE_OPS_GRAPH_STUB FTRACE_OPS_FL_STUB
+/*
+ * The function graph is called every time the function tracer is called.
+ * It must always test the ops hash and cannot just directly call
+ * the handler.
+ */
+# define FGRAPH_NO_DIRECT	1
 #else
-#define FTRACE_OPS_GRAPH_STUB 0
+# define FTRACE_OPS_GRAPH_STUB	0
+# define FGRAPH_NO_DIRECT	0
 #endif
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 0a2b8229b999..37eb2f0f3dd8 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -683,6 +683,11 @@ static inline void hist_poll_wakeup(void)
 
 #define hist_poll_wait(file, wait)	\
 	poll_wait(file, &hist_poll_wq, wait)
+
+#else
+static inline void hist_poll_wakeup(void)
+{
+}
 #endif
 
 #define __TRACE_EVENT_FLAGS(name, value)				\
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 4df766c690f9..40d373d65f9b 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -539,7 +539,11 @@ static struct fgraph_ops fgraph_stub = {
 static struct fgraph_ops *fgraph_direct_gops = &fgraph_stub;
 DEFINE_STATIC_CALL(fgraph_func, ftrace_graph_entry_stub);
 DEFINE_STATIC_CALL(fgraph_retfunc, ftrace_graph_ret_stub);
+#if FGRAPH_NO_DIRECT
+static DEFINE_STATIC_KEY_FALSE(fgraph_do_direct);
+#else
 static DEFINE_STATIC_KEY_TRUE(fgraph_do_direct);
+#endif
 
 /**
  * ftrace_graph_stop - set to permanently disable function graph tracing
@@ -843,7 +847,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
 	bitmap = get_bitmap_bits(current, offset);
 
 #ifdef CONFIG_HAVE_STATIC_CALL
-	if (static_branch_likely(&fgraph_do_direct)) {
+	if (!FGRAPH_NO_DIRECT && static_branch_likely(&fgraph_do_direct)) {
 		if (test_bit(fgraph_direct_gops->idx, &bitmap))
 			static_call(fgraph_retfunc)(&trace, fgraph_direct_gops, fregs);
 	} else
@@ -1285,6 +1289,9 @@ static void ftrace_graph_enable_direct(bool enable_branch, struct fgraph_ops *go
 	trace_func_graph_ret_t retfunc = NULL;
 	int i;
 
+	if (FGRAPH_NO_DIRECT)
+		return;
+
 	if (gops) {
 		func = gops->entryfunc;
 		retfunc = gops->retfunc;
@@ -1308,6 +1315,9 @@ static void ftrace_graph_enable_direct(bool enable_branch, struct fgraph_ops *go
 
 static void ftrace_graph_disable_direct(bool disable_branch)
 {
+	if (FGRAPH_NO_DIRECT)
+		return;
+
 	if (disable_branch)
 		static_branch_disable(&fgraph_do_direct);
 	static_call_update(fgraph_func, ftrace_graph_entry_stub);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index d33103408955..1e7a34a31851 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1849,6 +1849,7 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
 	struct ring_buffer_event *event;
 	u64 ts, delta;
 	int events = 0;
+	int len;
 	int e;
 
 	*delta_ptr = 0;
@@ -1856,9 +1857,12 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
 
 	ts = dpage->time_stamp;
 
-	for (e = 0; e < tail; e += rb_event_length(event)) {
+	for (e = 0; e < tail; e += len) {
 
 		event = (struct ring_buffer_event *)(dpage->data + e);
+		len = rb_event_length(event);
+		if (len <= 0 || len > tail - e)
+			return -1;
 
 		switch (event->type_len) {
 
@@ -1919,6 +1923,8 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 	if (!meta || !meta->head_buffer)
 		return;
 
+	orig_head = head_page = cpu_buffer->head_page;
+
 	/* Do the reader page first */
 	ret = rb_validate_buffer(cpu_buffer->reader_page->page, cpu_buffer->cpu);
 	if (ret < 0) {
@@ -1929,7 +1935,6 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 	entry_bytes += local_read(&cpu_buffer->reader_page->page->commit);
 	local_set(&cpu_buffer->reader_page->entries, ret);
 
-	orig_head = head_page = cpu_buffer->head_page;
 	ts = head_page->page->time_stamp;
 
 	/*
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 61fe01dce7a6..b659653dc03a 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1311,6 +1311,9 @@ static void remove_event_file_dir(struct trace_event_file *file)
 	free_event_filter(file->filter);
 	file->flags |= EVENT_FILE_FL_FREED;
 	event_file_put(file);
+
+	/* Wake up hist poll waiters to notice the EVENT_FILE_FL_FREED flag. */
+	hist_poll_wakeup();
 }
 
 /*
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index e6f449f53afc..768df987419e 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -5784,7 +5784,7 @@ static __poll_t event_hist_poll(struct file *file, struct poll_table_struct *wai
 
 	guard(mutex)(&event_mutex);
 
-	event_file = event_file_data(file);
+	event_file = event_file_file(file);
 	if (!event_file)
 		return EPOLLERR;
 
@@ -5822,7 +5822,7 @@ static int event_hist_open(struct inode *inode, struct file *file)
 
 	guard(mutex)(&event_mutex);
 
-	event_file = event_file_data(file);
+	event_file = event_file_file(file);
 	if (!event_file) {
 		ret = -ENODEV;
 		goto err;