[PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED

Steven Rostedt posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
kernel/trace/trace_events.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
[PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Steven Rostedt 1 month, 2 weeks ago
From: Steven Rostedt <rostedt@goodmis.org>

When eventfs was introduced, special care had to be done to coordinate the
freeing of the file meta data with the files that are exposed to user
space. The file meta data would have a ref count that is set when the file
is created and would be decremented and freed after the last user that
opened the file closed it. When the file meta data was to be freed, it
would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed,
and any new references made (like new opens or reads) would fail as it is
marked freed. This allowed other meta data to be freed after this flag was
set (under the event_mutex).

All the files that were dynamically created in the events directory had a
pointer to the file meta data and would call event_release() when the last
reference to the user space file was closed. This would be the time that it
is safe to free the file meta data.

A short cut was made for the "format" file. It's i_private would point to
the "call" entry directly and not point to the file's meta data. This is
because all format files are the same for the same "call", so it was
thought there was no reason to differentiate them.  The other files
maintain state (like the "enable", "trigger", etc). But this meant if the
file were to disappear, the "format" file would be unaware of it.

This fixes two bugs in the same code. One is a race that could be trigger
via the user_events test (that would create dynamic events and free them),
and running a loop that would read the user_events format files:

In one console run:

 # cd tools/testing/selftests/user_events
 # while true; do ./ftrace_test; done

And in another console run:

 # cd /sys/kernel/tracing/
 # while true; do cat events/user_events/__test_event/format; done 2>/dev/null

With KASAN memory checking, it would trigger a use-after-free bug. This was
because the format file was not checking the file's meta data flag
"EVENT_FILE_FL_FREED", so it would access the event that the file meta data
pointed to after it was freed.

The second bug is that the dynamic "format" file also registered a callback
to decrement the meta data, but the "data" pointer passed to the callback
was the event itself. Not the meta data to free. This would either cause a
memory leak (the meta data never was freed) or a crash as it could have
incorrectly freed the event itself.

Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/

Cc: stable@vger.kernel.org
Reported-by: Mathias Krause <minipli@grsecurity.net>
Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 6ef29eba90ce..852643d957de 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1540,7 +1540,8 @@ enum {
 
 static void *f_next(struct seq_file *m, void *v, loff_t *pos)
 {
-	struct trace_event_call *call = event_file_data(m->private);
+	struct trace_event_file *file = event_file_data(m->private);
+	struct trace_event_call *call = file->event_call;
 	struct list_head *common_head = &ftrace_common_fields;
 	struct list_head *head = trace_get_fields(call);
 	struct list_head *node = v;
@@ -1572,7 +1573,8 @@ static void *f_next(struct seq_file *m, void *v, loff_t *pos)
 
 static int f_show(struct seq_file *m, void *v)
 {
-	struct trace_event_call *call = event_file_data(m->private);
+	struct trace_event_file *file = event_file_data(m->private);
+	struct trace_event_call *call = file->event_call;
 	struct ftrace_event_field *field;
 	const char *array_descriptor;
 
@@ -1627,12 +1629,14 @@ static int f_show(struct seq_file *m, void *v)
 
 static void *f_start(struct seq_file *m, loff_t *pos)
 {
+	struct trace_event_file *file;
 	void *p = (void *)FORMAT_HEADER;
 	loff_t l = 0;
 
 	/* ->stop() is called even if ->start() fails */
 	mutex_lock(&event_mutex);
-	if (!event_file_data(m->private))
+	file = event_file_data(m->private);
+	if (!file || (file->flags & EVENT_FILE_FL_FREED))
 		return ERR_PTR(-ENODEV);
 
 	while (l < *pos && p)
@@ -2485,7 +2489,6 @@ static int event_callback(const char *name, umode_t *mode, void **data,
 	if (strcmp(name, "format") == 0) {
 		*mode = TRACE_MODE_READ;
 		*fops = &ftrace_event_format_fops;
-		*data = call;
 		return 1;
 	}
 
-- 
2.43.0
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Masami Hiramatsu (Google) 1 month, 1 week ago
On Thu, 25 Jul 2024 20:15:17 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> When eventfs was introduced, special care had to be done to coordinate the
> freeing of the file meta data with the files that are exposed to user
> space. The file meta data would have a ref count that is set when the file
> is created and would be decremented and freed after the last user that
> opened the file closed it. When the file meta data was to be freed, it
> would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed,
> and any new references made (like new opens or reads) would fail as it is
> marked freed. This allowed other meta data to be freed after this flag was
> set (under the event_mutex).
> 
> All the files that were dynamically created in the events directory had a
> pointer to the file meta data and would call event_release() when the last
> reference to the user space file was closed. This would be the time that it
> is safe to free the file meta data.
> 
> A short cut was made for the "format" file. It's i_private would point to
> the "call" entry directly and not point to the file's meta data. This is
> because all format files are the same for the same "call", so it was
> thought there was no reason to differentiate them.  The other files
> maintain state (like the "enable", "trigger", etc). But this meant if the
> file were to disappear, the "format" file would be unaware of it.
> 
> This fixes two bugs in the same code. One is a race that could be trigger
> via the user_events test (that would create dynamic events and free them),
> and running a loop that would read the user_events format files:
> 
> In one console run:
> 
>  # cd tools/testing/selftests/user_events
>  # while true; do ./ftrace_test; done
> 
> And in another console run:
> 
>  # cd /sys/kernel/tracing/
>  # while true; do cat events/user_events/__test_event/format; done 2>/dev/null
> 
> With KASAN memory checking, it would trigger a use-after-free bug. This was
> because the format file was not checking the file's meta data flag
> "EVENT_FILE_FL_FREED", so it would access the event that the file meta data
> pointed to after it was freed.
> 
> The second bug is that the dynamic "format" file also registered a callback
> to decrement the meta data, but the "data" pointer passed to the callback
> was the event itself. Not the meta data to free. This would either cause a
> memory leak (the meta data never was freed) or a crash as it could have
> incorrectly freed the event itself.
> 
> Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
> 
> Cc: stable@vger.kernel.org
> Reported-by: Mathias Krause <minipli@grsecurity.net>
> Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Looks good to me. It introduces a chance of checking EVENT_FILE_FL_FREED
as same as other files.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks,

> ---
>  kernel/trace/trace_events.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 6ef29eba90ce..852643d957de 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1540,7 +1540,8 @@ enum {
>  
>  static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>  {
> -	struct trace_event_call *call = event_file_data(m->private);
> +	struct trace_event_file *file = event_file_data(m->private);
> +	struct trace_event_call *call = file->event_call;
>  	struct list_head *common_head = &ftrace_common_fields;
>  	struct list_head *head = trace_get_fields(call);
>  	struct list_head *node = v;
> @@ -1572,7 +1573,8 @@ static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>  
>  static int f_show(struct seq_file *m, void *v)
>  {
> -	struct trace_event_call *call = event_file_data(m->private);
> +	struct trace_event_file *file = event_file_data(m->private);
> +	struct trace_event_call *call = file->event_call;
>  	struct ftrace_event_field *field;
>  	const char *array_descriptor;
>  
> @@ -1627,12 +1629,14 @@ static int f_show(struct seq_file *m, void *v)
>  
>  static void *f_start(struct seq_file *m, loff_t *pos)
>  {
> +	struct trace_event_file *file;
>  	void *p = (void *)FORMAT_HEADER;
>  	loff_t l = 0;
>  
>  	/* ->stop() is called even if ->start() fails */
>  	mutex_lock(&event_mutex);
> -	if (!event_file_data(m->private))
> +	file = event_file_data(m->private);
> +	if (!file || (file->flags & EVENT_FILE_FL_FREED))
>  		return ERR_PTR(-ENODEV);
>  
>  	while (l < *pos && p)
> @@ -2485,7 +2489,6 @@ static int event_callback(const char *name, umode_t *mode, void **data,
>  	if (strcmp(name, "format") == 0) {
>  		*mode = TRACE_MODE_READ;
>  		*fops = &ftrace_event_format_fops;
> -		*data = call;
>  		return 1;
>  	}
>  
> -- 
> 2.43.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Ajay Kaher 1 month, 1 week ago
On Fri, Jul 26, 2024 at 5:45 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: Steven Rostedt <rostedt@goodmis.org>
>
> When eventfs was introduced, special care had to be done to coordinate the
> freeing of the file meta data with the files that are exposed to user
> space. The file meta data would have a ref count that is set when the file
> is created and would be decremented and freed after the last user that
> opened the file closed it. When the file meta data was to be freed, it
> would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed,
> and any new references made (like new opens or reads) would fail as it is
> marked freed. This allowed other meta data to be freed after this flag was
> set (under the event_mutex).
>
> All the files that were dynamically created in the events directory had a
> pointer to the file meta data and would call event_release() when the last
> reference to the user space file was closed. This would be the time that it
> is safe to free the file meta data.
>
> A short cut was made for the "format" file. It's i_private would point to
> the "call" entry directly and not point to the file's meta data. This is
> because all format files are the same for the same "call", so it was
> thought there was no reason to differentiate them.  The other files
> maintain state (like the "enable", "trigger", etc). But this meant if the
> file were to disappear, the "format" file would be unaware of it.
>
> This fixes two bugs in the same code. One is a race that could be trigger
> via the user_events test (that would create dynamic events and free them),
> and running a loop that would read the user_events format files:
>
> In one console run:
>
>  # cd tools/testing/selftests/user_events
>  # while true; do ./ftrace_test; done
>
> And in another console run:
>
>  # cd /sys/kernel/tracing/
>  # while true; do cat events/user_events/__test_event/format; done 2>/dev/null
>
> With KASAN memory checking, it would trigger a use-after-free bug. This was
> because the format file was not checking the file's meta data flag
> "EVENT_FILE_FL_FREED", so it would access the event that the file meta data
> pointed to after it was freed.
>
> The second bug is that the dynamic "format" file also registered a callback
> to decrement the meta data, but the "data" pointer passed to the callback
> was the event itself. Not the meta data to free. This would either cause a
> memory leak (the meta data never was freed) or a crash as it could have
> incorrectly freed the event itself.
>
> Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
>
> Cc: stable@vger.kernel.org
> Reported-by: Mathias Krause <minipli@grsecurity.net>
> Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_events.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 6ef29eba90ce..852643d957de 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1540,7 +1540,8 @@ enum {
>
>  static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>  {
> -       struct trace_event_call *call = event_file_data(m->private);
> +       struct trace_event_file *file = event_file_data(m->private);
> +       struct trace_event_call *call = file->event_call;
>         struct list_head *common_head = &ftrace_common_fields;
>         struct list_head *head = trace_get_fields(call);
>         struct list_head *node = v;
> @@ -1572,7 +1573,8 @@ static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>
>  static int f_show(struct seq_file *m, void *v)
>  {
> -       struct trace_event_call *call = event_file_data(m->private);
> +       struct trace_event_file *file = event_file_data(m->private);
> +       struct trace_event_call *call = file->event_call;
>         struct ftrace_event_field *field;
>         const char *array_descriptor;
>
> @@ -1627,12 +1629,14 @@ static int f_show(struct seq_file *m, void *v)
>
>  static void *f_start(struct seq_file *m, loff_t *pos)
>  {
> +       struct trace_event_file *file;
>         void *p = (void *)FORMAT_HEADER;
>         loff_t l = 0;
>
>         /* ->stop() is called even if ->start() fails */
>         mutex_lock(&event_mutex);
> -       if (!event_file_data(m->private))
> +       file = event_file_data(m->private);
> +       if (!file || (file->flags & EVENT_FILE_FL_FREED))
>                 return ERR_PTR(-ENODEV);

Some doubt:
Because of the same race condition, it may happen that kmem_cache_free(file)
was executed while f_start() is waiting to get event_mutex. Once
f_start() acquires
event_mutex, it will access the *file which points to the freed cache.
I am assuming in this case KASAN will not show anything as *file
belongs to cache.

- Ajay

>         while (l < *pos && p)
> @@ -2485,7 +2489,6 @@ static int event_callback(const char *name, umode_t *mode, void **data,
>         if (strcmp(name, "format") == 0) {
>                 *mode = TRACE_MODE_READ;
>                 *fops = &ftrace_event_format_fops;
> -               *data = call;
>                 return 1;
>         }
>
> --
> 2.43.0
>
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Steven Rostedt 1 month, 1 week ago
On Fri, 26 Jul 2024 18:00:18 +0530
Ajay Kaher <ajay.kaher@broadcom.com> wrote:

> Some doubt:
> Because of the same race condition, it may happen that kmem_cache_free(file)
> was executed while f_start() is waiting to get event_mutex. Once
> f_start() acquires
> event_mutex, it will access the *file which points to the freed cache.
> I am assuming in this case KASAN will not show anything as *file
> belongs to cache.

No, the file is freed by the callback from eventfs when the last reference
to the file is released. That is, there's no more references to the files
(nothing has it opened). As this code is only called when the file is
opened, it will not race with the freeing of the descriptor.

See event_create_dir(), it registers the dynamically created directory
and files. It will also do call event_file_get() that adds a reference
on this file/directory descriptor. It also registers the
"event_release" function to be called when the last reference of all
open files are closed in that directory.

That event_release() will call event_file_put() that does the final
release and frees the file. This prevents file from being freed while
anything has it opened.

While looking at this code I did realize that the "format" doesn't
register an "event_release" and there's no bug with its data pointing
to the call with respect to freeing something it shouldn't be. But it
still needs the file pointer anyway so that it can have access to its
flags.

-- Steve
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Ajay Kaher 1 month, 1 week ago
On Fri, Jul 26, 2024 at 9:33 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 26 Jul 2024 18:00:18 +0530
> Ajay Kaher <ajay.kaher@broadcom.com> wrote:
>
> > Some doubt:
> > Because of the same race condition, it may happen that kmem_cache_free(file)
> > was executed while f_start() is waiting to get event_mutex. Once
> > f_start() acquires
> > event_mutex, it will access the *file which points to the freed cache.
> > I am assuming in this case KASAN will not show anything as *file
> > belongs to cache.
>
> No, the file is freed by the callback from eventfs when the last reference
> to the file is released. That is, there's no more references to the files
> (nothing has it opened). As this code is only called when the file is
> opened, it will not race with the freeing of the descriptor.
>
> See event_create_dir(), it registers the dynamically created directory
> and files. It will also do call event_file_get() that adds a reference
> on this file/directory descriptor. It also registers the
> "event_release" function to be called when the last reference of all
> open files are closed in that directory.
>
> That event_release() will call event_file_put() that does the final
> release and frees the file. This prevents file from being freed while
> anything has it opened.
>
> While looking at this code I did realize that the "format" doesn't
> register an "event_release" and there's no bug with its data pointing
> to the call with respect to freeing something it shouldn't be. But it
> still needs the file pointer anyway so that it can have access to its
> flags.

Got it, VFS calls the callback tracefs_d_release() -> ... ->
kmem_cache_free(file). And this will happen on close of the last
reference.

Following is not related to this bug:
event_release callback executed once 'dir' closed (no more ref), any
specific reason to register with 'dir'/'enable' file. If not, could we
register with the 'dir' instead of 'enable'.

-Ajay
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Steven Rostedt 1 month, 1 week ago
On Mon, 29 Jul 2024 18:29:49 +0530
Ajay Kaher <ajay.kaher@broadcom.com> wrote:

> Following is not related to this bug:
> event_release callback executed once 'dir' closed (no more ref), any
> specific reason to register with 'dir'/'enable' file. If not, could we
> register with the 'dir' instead of 'enable'.

I tried that at first but it got messy. The files are saved in an array
for all files in the directory. There is just one array for all
directories that gets passed to the creation functions. That is, it
doesn't grow the memory footprint with the number of directories and
files created. By adding a callback for each file seemed a bit more
robust (and easier to implement as it only modified what the array
pointed to).

Although, I could change this array to a director structure that has a
callback for when it is destroyed and an array of all the files. That
may be the better approach but it caused a bigger change to the code as
it changed a lot of the prototypes.

So I ended up with the per file callback. But it could be updated if it
is appropriate.

-- Steve
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Mathias Krause 1 month, 1 week ago
On 26.07.24 02:15, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
> 
> When eventfs was introduced, special care had to be done to coordinate the
> freeing of the file meta data with the files that are exposed to user
> space. The file meta data would have a ref count that is set when the file
> is created and would be decremented and freed after the last user that
> opened the file closed it. When the file meta data was to be freed, it
> would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed,
> and any new references made (like new opens or reads) would fail as it is
> marked freed. This allowed other meta data to be freed after this flag was
> set (under the event_mutex).
> 
> All the files that were dynamically created in the events directory had a
> pointer to the file meta data and would call event_release() when the last
> reference to the user space file was closed. This would be the time that it
> is safe to free the file meta data.
> 
> A short cut was made for the "format" file. It's i_private would point to
> the "call" entry directly and not point to the file's meta data. This is
> because all format files are the same for the same "call", so it was
> thought there was no reason to differentiate them.  The other files
> maintain state (like the "enable", "trigger", etc). But this meant if the
> file were to disappear, the "format" file would be unaware of it.
> 
> This fixes two bugs in the same code. One is a race that could be trigger
> via the user_events test (that would create dynamic events and free them),
> and running a loop that would read the user_events format files:
> 
> In one console run:
> 
>  # cd tools/testing/selftests/user_events
>  # while true; do ./ftrace_test; done
> 
> And in another console run:
> 
>  # cd /sys/kernel/tracing/
>  # while true; do cat events/user_events/__test_event/format; done 2>/dev/null
> 
> With KASAN memory checking, it would trigger a use-after-free bug. This was

The UAF bug is there even without KASAN. It's just that KASAN makes it
much easier to detect and catch early.

> because the format file was not checking the file's meta data flag
> "EVENT_FILE_FL_FREED", so it would access the event that the file meta data
> pointed to after it was freed.
> 
> The second bug is that the dynamic "format" file also registered a callback
> to decrement the meta data, but the "data" pointer passed to the callback
> was the event itself. Not the meta data to free. This would either cause a
> memory leak (the meta data never was freed) or a crash as it could have
> incorrectly freed the event itself.
> 
> Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
> 
> Cc: stable@vger.kernel.org
> Reported-by: Mathias Krause <minipli@grsecurity.net>
> Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")

That fixes tag looks odd as it didn't introduce the bug. It's some late
change to v6.9 but my bisect run showed, it's triggering as early as in
v6.6 (commit 27152bceea1d ("eventfs: Move tracing/events to eventfs")).

git blame points to 5790b1fb3d67 ("eventfs: Remove eventfs_file and just
use eventfs_inode"), which is still too young, as it's v6.7.

IMHO, this needs at least the following additional fixes tags to ensure
all stable kernels get covered:

Fixes: 5790b1fb3d67 ("eventfs: Remove eventfs_file and just use
eventfs_inode")
Fixes: 27152bceea1d ("eventfs: Move tracing/events to eventfs")

Even if 27152bceea1d is not the real cause, just the commit making the
bug reachable. But from looking at the history, this was always wrong?

> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_events.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 6ef29eba90ce..852643d957de 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1540,7 +1540,8 @@ enum {
>  
>  static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>  {
> -	struct trace_event_call *call = event_file_data(m->private);
> +	struct trace_event_file *file = event_file_data(m->private);
> +	struct trace_event_call *call = file->event_call;
>  	struct list_head *common_head = &ftrace_common_fields;
>  	struct list_head *head = trace_get_fields(call);
>  	struct list_head *node = v;
> @@ -1572,7 +1573,8 @@ static void *f_next(struct seq_file *m, void *v, loff_t *pos)
>  
>  static int f_show(struct seq_file *m, void *v)
>  {
> -	struct trace_event_call *call = event_file_data(m->private);
> +	struct trace_event_file *file = event_file_data(m->private);
> +	struct trace_event_call *call = file->event_call;
>  	struct ftrace_event_field *field;
>  	const char *array_descriptor;
>  
> @@ -1627,12 +1629,14 @@ static int f_show(struct seq_file *m, void *v)
>  
>  static void *f_start(struct seq_file *m, loff_t *pos)
>  {
> +	struct trace_event_file *file;
>  	void *p = (void *)FORMAT_HEADER;
>  	loff_t l = 0;
>  
>  	/* ->stop() is called even if ->start() fails */
>  	mutex_lock(&event_mutex);
> -	if (!event_file_data(m->private))
> +	file = event_file_data(m->private);
> +	if (!file || (file->flags & EVENT_FILE_FL_FREED))
>  		return ERR_PTR(-ENODEV);
>  
>  	while (l < *pos && p)
> @@ -2485,7 +2489,6 @@ static int event_callback(const char *name, umode_t *mode, void **data,
>  	if (strcmp(name, "format") == 0) {
>  		*mode = TRACE_MODE_READ;
>  		*fops = &ftrace_event_format_fops;
> -		*data = call;
>  		return 1;
>  	}
>  

Tested-by: Mathias Krause <minipli@grsecurity.net>

Thanks,
Mathias
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Steven Rostedt 1 month, 1 week ago
On Fri, 26 Jul 2024 12:16:16 +0200
Mathias Krause <minipli@grsecurity.net> wrote:

> > 
> > With KASAN memory checking, it would trigger a use-after-free bug. This was  
> 
> The UAF bug is there even without KASAN. It's just that KASAN makes it
> much easier to detect and catch early.

Well the bug happens without KASAN but the "tigger" is shown by KASAN.
I was assuming people understood that.

> 
> > because the format file was not checking the file's meta data flag
> > "EVENT_FILE_FL_FREED", so it would access the event that the file meta data
> > pointed to after it was freed.
> > 
> > The second bug is that the dynamic "format" file also registered a callback
> > to decrement the meta data, but the "data" pointer passed to the callback
> > was the event itself. Not the meta data to free. This would either cause a
> > memory leak (the meta data never was freed) or a crash as it could have
> > incorrectly freed the event itself.

I need to remove the above, as I realized the release callback doesn't
get called for the "filter" but for only the "enable". That doesn't get
called until all files have no more references. So there's only one bug
here.

> > 
> > Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
> > 
> > Cc: stable@vger.kernel.org
> > Reported-by: Mathias Krause <minipli@grsecurity.net>
> > Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")  
> 
> That fixes tag looks odd as it didn't introduce the bug. It's some late
> change to v6.9 but my bisect run showed, it's triggering as early as in
> v6.6 (commit 27152bceea1d ("eventfs: Move tracing/events to eventfs")).
> 
> git blame points to 5790b1fb3d67 ("eventfs: Remove eventfs_file and just
> use eventfs_inode"), which is still too young, as it's v6.7.

But if you look at the commit I posted. It has:

  Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")

And you need to add that to apply this patch as it has that as the
dependency. If you try to apply this to the change that had the
original bug, it will not apply. I basically say that this patch is a
fix to the previous fix.

> 
> IMHO, this needs at least the following additional fixes tags to ensure
> all stable kernels get covered:
> 
> Fixes: 5790b1fb3d67 ("eventfs: Remove eventfs_file and just use
> eventfs_inode")
> Fixes: 27152bceea1d ("eventfs: Move tracing/events to eventfs")
> 
> Even if 27152bceea1d is not the real cause, just the commit making the
> bug reachable. But from looking at the history, this was always wrong?

All stable kernels should get covered as 27152bceea1d has both a Cc
stable tag and a Fixes tag for 5790b1fb3d67. And the stable kernels
look at what commits have been backported to determine what other
commits should be backported. By saying this fixes 27152bceea1d, it
should all work out correctly.

> 
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
> >  kernel/trace/trace_events.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 6ef29eba90ce..852643d957de 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -1540,7 +1540,8 @@ enum {
> >  
> >  static void *f_next(struct seq_file *m, void *v, loff_t *pos)
> >  {
> > -	struct trace_event_call *call = event_file_data(m->private);
> > +	struct trace_event_file *file = event_file_data(m->private);
> > +	struct trace_event_call *call = file->event_call;
> >  	struct list_head *common_head = &ftrace_common_fields;
> >  	struct list_head *head = trace_get_fields(call);
> >  	struct list_head *node = v;
> > @@ -1572,7 +1573,8 @@ static void *f_next(struct seq_file *m, void *v, loff_t *pos)
> >  
> >  static int f_show(struct seq_file *m, void *v)
> >  {
> > -	struct trace_event_call *call = event_file_data(m->private);
> > +	struct trace_event_file *file = event_file_data(m->private);
> > +	struct trace_event_call *call = file->event_call;
> >  	struct ftrace_event_field *field;
> >  	const char *array_descriptor;
> >  
> > @@ -1627,12 +1629,14 @@ static int f_show(struct seq_file *m, void *v)
> >  
> >  static void *f_start(struct seq_file *m, loff_t *pos)
> >  {
> > +	struct trace_event_file *file;
> >  	void *p = (void *)FORMAT_HEADER;
> >  	loff_t l = 0;
> >  
> >  	/* ->stop() is called even if ->start() fails */
> >  	mutex_lock(&event_mutex);
> > -	if (!event_file_data(m->private))
> > +	file = event_file_data(m->private);
> > +	if (!file || (file->flags & EVENT_FILE_FL_FREED))
> >  		return ERR_PTR(-ENODEV);
> >  
> >  	while (l < *pos && p)
> > @@ -2485,7 +2489,6 @@ static int event_callback(const char *name, umode_t *mode, void **data,
> >  	if (strcmp(name, "format") == 0) {
> >  		*mode = TRACE_MODE_READ;
> >  		*fops = &ftrace_event_format_fops;
> > -		*data = call;
> >  		return 1;
> >  	}
> >    
> 
> Tested-by: Mathias Krause <minipli@grsecurity.net>

Thanks!

-- Steve
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Mathias Krause 1 month, 1 week ago
On 26.07.24 16:52, Steven Rostedt wrote:
> On Fri, 26 Jul 2024 12:16:16 +0200
> Mathias Krause <minipli@grsecurity.net> wrote:
> 
>>>
>>> With KASAN memory checking, it would trigger a use-after-free bug. This was  
>>
>> The UAF bug is there even without KASAN. It's just that KASAN makes it
>> much easier to detect and catch early.
> 
> Well the bug happens without KASAN but the "tigger" is shown by KASAN.
> I was assuming people understood that.

Ahh, okay. I'd written "report" instead of "trigger" to make it less
ambiguous.

>>> because the format file was not checking the file's meta data flag
>>> "EVENT_FILE_FL_FREED", so it would access the event that the file meta data
>>> pointed to after it was freed.
>>>
>>> The second bug is that the dynamic "format" file also registered a callback
>>> to decrement the meta data, but the "data" pointer passed to the callback
>>> was the event itself. Not the meta data to free. This would either cause a
>>> memory leak (the meta data never was freed) or a crash as it could have
>>> incorrectly freed the event itself.
> 
> I need to remove the above, as I realized the release callback doesn't
> get called for the "filter" but for only the "enable". That doesn't get
> called until all files have no more references. So there's only one bug
> here.

Yeah, all files are covered by the same event_file and only "enable"
does the final put.

>>>
>>> Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
>>>
>>> Cc: stable@vger.kernel.org
>>> Reported-by: Mathias Krause <minipli@grsecurity.net>
>>> Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")  
>>
>> That fixes tag looks odd as it didn't introduce the bug. It's some late
>> change to v6.9 but my bisect run showed, it's triggering as early as in
>> v6.6 (commit 27152bceea1d ("eventfs: Move tracing/events to eventfs")).
>>
>> git blame points to 5790b1fb3d67 ("eventfs: Remove eventfs_file and just
>> use eventfs_inode"), which is still too young, as it's v6.7.
> 
> But if you look at the commit I posted. It has:
> 
>   Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
> 
> And you need to add that to apply this patch as it has that as the
> dependency. If you try to apply this to the change that had the
> original bug, it will not apply. I basically say that this patch is a
> fix to the previous fix.

So far so good.

>>
>> IMHO, this needs at least the following additional fixes tags to ensure
>> all stable kernels get covered:
>>
>> Fixes: 5790b1fb3d67 ("eventfs: Remove eventfs_file and just use
>> eventfs_inode")
>> Fixes: 27152bceea1d ("eventfs: Move tracing/events to eventfs")
>>
>> Even if 27152bceea1d is not the real cause, just the commit making the
>> bug reachable. But from looking at the history, this was always wrong?
> 
> All stable kernels should get covered as 27152bceea1d has both a Cc
> stable tag and a Fixes tag for 5790b1fb3d67. And the stable kernels
> look at what commits have been backported to determine what other
> commits should be backported.

Now you lost me. Neither has 27152bceea1d a Cc stable tag, nor a Fixes
tag for 5790b1fb3d67. It simply cannot, because it's from July 2023
(v6.6-rc1) and 5790b1fb3d67 is from October 2024 (v6.7-rc1).

>                               By saying this fixes 27152bceea1d, it
> should all work out correctly.

That would be fine with me, as that's what my git bisect run pointed at
as well -- the oldest commit triggering the bug. However, in your v2
it's still b63db58e2fa5d (which has a Fixes tag for 5790b1fb3d672 but
not 27152bceea1d) which would suggest only kernels down to v6.7 are
affected.

Thanks,
Mathias
Re: [PATCH] tracing: Have format file honor EVENT_FILE_FL_FREED
Posted by Steven Rostedt 1 month, 1 week ago
On Fri, 26 Jul 2024 21:58:30 +0200
Mathias Krause <minipli@grsecurity.net> wrote:
> >>
> >> IMHO, this needs at least the following additional fixes tags to ensure
> >> all stable kernels get covered:
> >>
> >> Fixes: 5790b1fb3d67 ("eventfs: Remove eventfs_file and just use
> >> eventfs_inode")
> >> Fixes: 27152bceea1d ("eventfs: Move tracing/events to eventfs")
> >>
> >> Even if 27152bceea1d is not the real cause, just the commit making the
> >> bug reachable. But from looking at the history, this was always wrong?  
> > 
> > All stable kernels should get covered as 27152bceea1d has both a Cc
> > stable tag and a Fixes tag for 5790b1fb3d67. And the stable kernels
> > look at what commits have been backported to determine what other
> > commits should be backported.  
> 
> Now you lost me. Neither has 27152bceea1d a Cc stable tag, nor a Fixes
> tag for 5790b1fb3d67. It simply cannot, because it's from July 2023
> (v6.6-rc1) and 5790b1fb3d67 is from October 2024 (v6.7-rc1).

I'm juggling too many things around. I was thinking that 27152bceea1d
was b63db58e2fa5d. My mistake.

> 
> >                               By saying this fixes 27152bceea1d, it
> > should all work out correctly.  
> 
> That would be fine with me, as that's what my git bisect run pointed at
> as well -- the oldest commit triggering the bug. However, in your v2
> it's still b63db58e2fa5d (which has a Fixes tag for 5790b1fb3d672 but
> not 27152bceea1d) which would suggest only kernels down to v6.7 are
> affected.

OK, I see what your saying. So the bug is present with 27152bceea1d,
but so are a lot of other bugs. This was completely rewritten with the help
from Linus, and an effort was made to backport it all to 6.6.

  https://lore.kernel.org/all/20240206120905.570408983@rostedt.homelinux.com/

The above includes the 5790b1fb3d672 commit. Which is why this is the
commit I labeled as the main patch to backport to.

-- Steve