[GIT PULL] tracing: A few more fixes for 6.7

Steven Rostedt posted 1 patch 1 year, 12 months ago
fs/tracefs/event_inode.c            | 12 +++++++++---
kernel/trace/synth_event_gen_test.c | 11 +++++++++++
kernel/trace/trace_events_synth.c   |  4 ++--
3 files changed, 22 insertions(+), 5 deletions(-)
[GIT PULL] tracing: A few more fixes for 6.7
Posted by Steven Rostedt 1 year, 12 months ago

Linus,

Tracing fixes for 6.7:

- Fix another kerneldoc warning

- Fix eventfs files to inherit the ownership of its parent directory.
  The dynamic creating of dentries in eventfs did not take into
  account if the tracefs file system was mounted with a gid/uid,
  and would still default to the gid/uid of root. This is a regression.

- Fix warning when synthetic event testing is enabled along with
  startup event tracing testing is enabled


Please pull the latest trace-v6.7-rc6-2 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
trace-v6.7-rc6-2

Tag SHA1: 075832106638b57ad8e9b24891fec6d49aec5258
Head SHA1: 88b30c7f5d27e1594d70dc2bd7199b18f2b57fa9


Randy Dunlap (1):
      tracing/synthetic: fix kernel-doc warnings

Steven Rostedt (Google) (2):
      eventfs: Have event files and directories default to parent uid and gid
      tracing / synthetic: Disable events after testing in synth_event_gen_test_init()

----
 fs/tracefs/event_inode.c            | 12 +++++++++---
 kernel/trace/synth_event_gen_test.c | 11 +++++++++++
 kernel/trace/trace_events_synth.c   |  4 ++--
 3 files changed, 22 insertions(+), 5 deletions(-)
---------------------------
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 43e237864a42..2ccc849a5bda 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -148,7 +148,8 @@ static const struct file_operations eventfs_file_operations = {
 	.release	= eventfs_release,
 };
 
-static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, umode_t mode)
+static void update_inode_attr(struct dentry *dentry, struct inode *inode,
+			      struct eventfs_attr *attr, umode_t mode)
 {
 	if (!attr) {
 		inode->i_mode = mode;
@@ -162,9 +163,13 @@ static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, um
 
 	if (attr->mode & EVENTFS_SAVE_UID)
 		inode->i_uid = attr->uid;
+	else
+		inode->i_uid = d_inode(dentry->d_parent)->i_uid;
 
 	if (attr->mode & EVENTFS_SAVE_GID)
 		inode->i_gid = attr->gid;
+	else
+		inode->i_gid = d_inode(dentry->d_parent)->i_gid;
 }
 
 /**
@@ -206,7 +211,7 @@ static struct dentry *create_file(const char *name, umode_t mode,
 		return eventfs_failed_creating(dentry);
 
 	/* If the user updated the directory's attributes, use them */
-	update_inode_attr(inode, attr, mode);
+	update_inode_attr(dentry, inode, attr, mode);
 
 	inode->i_op = &eventfs_file_inode_operations;
 	inode->i_fop = fop;
@@ -242,7 +247,8 @@ static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent
 		return eventfs_failed_creating(dentry);
 
 	/* If the user updated the directory's attributes, use them */
-	update_inode_attr(inode, &ei->attr, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
+	update_inode_attr(dentry, inode, &ei->attr,
+			  S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
 
 	inode->i_op = &eventfs_root_dir_inode_operations;
 	inode->i_fop = &eventfs_file_operations;
diff --git a/kernel/trace/synth_event_gen_test.c b/kernel/trace/synth_event_gen_test.c
index 8dfe85499d4a..354c2117be43 100644
--- a/kernel/trace/synth_event_gen_test.c
+++ b/kernel/trace/synth_event_gen_test.c
@@ -477,6 +477,17 @@ static int __init synth_event_gen_test_init(void)
 
 	ret = test_trace_synth_event();
 	WARN_ON(ret);
+
+	/* Disable when done */
+	trace_array_set_clr_event(gen_synth_test->tr,
+				  "synthetic",
+				  "gen_synth_test", false);
+	trace_array_set_clr_event(empty_synth_test->tr,
+				  "synthetic",
+				  "empty_synth_test", false);
+	trace_array_set_clr_event(create_synth_test->tr,
+				  "synthetic",
+				  "create_synth_test", false);
  out:
 	return ret;
 }
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 846e02c0fb59..e7af286af4f1 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1137,7 +1137,7 @@ EXPORT_SYMBOL_GPL(synth_event_add_fields);
  * @cmd: A pointer to the dynevent_cmd struct representing the new event
  * @name: The name of the synthetic event
  * @mod: The module creating the event, NULL if not created from a module
- * @args: Variable number of arg (pairs), one pair for each field
+ * @...: Variable number of arg (pairs), one pair for each field
  *
  * NOTE: Users normally won't want to call this function directly, but
  * rather use the synth_event_gen_cmd_start() wrapper, which
@@ -1695,7 +1695,7 @@ __synth_event_trace_end(struct synth_event_trace_state *trace_state)
  * synth_event_trace - Trace a synthetic event
  * @file: The trace_event_file representing the synthetic event
  * @n_vals: The number of values in vals
- * @args: Variable number of args containing the event values
+ * @...: Variable number of args containing the event values
  *
  * Trace a synthetic event using the values passed in the variable
  * argument list.
Re: [GIT PULL] tracing: A few more fixes for 6.7
Posted by pr-tracker-bot@kernel.org 1 year, 12 months ago
The pull request you sent on Thu, 21 Dec 2023 10:27:03 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git trace-v6.7-rc6-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/13b734465a9d1cd09551d52eb5faf5fe55e6a9ea

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] tracing: A few more fixes for 6.7
Posted by Linus Torvalds 1 year, 12 months ago
On Thu, 21 Dec 2023 at 07:26, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> - Fix eventfs files to inherit the ownership of its parent directory.
>   The dynamic creating of dentries in eventfs did not take into
>   account if the tracefs file system was mounted with a gid/uid,
>   and would still default to the gid/uid of root. This is a regression.

Honestly, this seems to still be entirely buggy. In fact, it looks
buggy in two different ways:

 (a) if 'attr' is NULL, none of this logic is triggered, and uid/gid
is still left as root despite the explicit mount options

 (b) if somebody has done a chown/gid on the directory, the new
dynamic creation logic seems to create any files inside that directory
with the new uid/gid.

Maybe (a) cannot happen, but that code in update_inode_attr() does
have a check for a NULL attr, so either it can happen, or that check
is bogus.

And (b) just looks messy.  Maybe you've disallowed chown/chgid on
tracefs, I didn't check. But why would it inherit the parent uid/gid?
That just doesn't seem to make any sense at all.

I still claim that the whole dynamic ftrace stuff was a huge mistake,
and that the real solution should always have been to just use one
single inode for every file (and use that 'attr' that you track and
the '->getattr()' callback to make them all *look* different to
users).

               Linus
Re: [GIT PULL] tracing: A few more fixes for 6.7
Posted by Steven Rostedt 1 year, 12 months ago
On Thu, 21 Dec 2023 09:45:43 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 21 Dec 2023 at 07:26, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > - Fix eventfs files to inherit the ownership of its parent directory.
> >   The dynamic creating of dentries in eventfs did not take into
> >   account if the tracefs file system was mounted with a gid/uid,
> >   and would still default to the gid/uid of root. This is a regression.  
> 
> Honestly, this seems to still be entirely buggy. In fact, it looks
> buggy in two different ways:
> 
>  (a) if 'attr' is NULL, none of this logic is triggered, and uid/gid
> is still left as root despite the explicit mount options
> 
>  (b) if somebody has done a chown/gid on the directory, the new
> dynamic creation logic seems to create any files inside that directory
> with the new uid/gid.
> 
> Maybe (a) cannot happen, but that code in update_inode_attr() does
> have a check for a NULL attr, so either it can happen, or that check
> is bogus.
> 
> And (b) just looks messy.  Maybe you've disallowed chown/chgid on
> tracefs, I didn't check. But why would it inherit the parent uid/gid?
> That just doesn't seem to make any sense at all.

I see you pulled the change already, but you are correct with the above two
points.

(a) can happen for files, I only tested directories and forgot to check the
    files. I verified that it's still broken for them:

 # umount /sys/kernel/tracing
 # mount -o rw,seclabel,relatime,gid=1000 -t tracefs nodev /mnt

 # ls -ld /mnt/events/sched
drwxr-xr-x 28 root rostedt 0 Dec 21 13:12 /mnt/events/sched/

 # ls -ld /mnt/events/sched/sched_switch
drwxr-xr-x 2 root rostedt 0 Dec 21 13:12 /mnt/events/sched/sched_switch/

But when checking the files:

 # ls -l /mnt/events/sched/sched_switch
total 0
-rw-r----- 1 root root 0 Dec 21 13:12 enable
-rw-r----- 1 root root 0 Dec 21 13:12 filter
-r--r----- 1 root root 0 Dec 21 13:12 format
-r--r----- 1 root root 0 Dec 21 13:12 hist
-r--r----- 1 root root 0 Dec 21 13:12 id
-rw-r----- 1 root root 0 Dec 21 13:12 trigger

so the files still need to be fixed.

And I agree, that using the directory parent was not a good idea. I took
that from the normal creation of the tracefs files which copy the parent
uid/gid, which makes sense as they are all created together, but does not
make sense when created dynamically. In that case, it needs to default to
what the superblock was mounted as.

Luckily, that's easy to get to. All I need to do is:

static void update_inode_attr(struct dentry *dentry, struct inode *inode,
			      struct eventfs_attr *attr, umode_t mode)
{
	struct tracefs_fs_info *fsi = dentry->d_sb->s_fs_info;
	struct tracefs_mount_opts *opts = &fsi->mount_opts;

	/* Default the ownership to what it was mounted as */
	inode->i_uid = opts->uid;
	inode->i_gid = opts->gid;

	if (!attr) {
		inode->i_mode = mode;
		return;
	}

	if (attr->mode & EVENTFS_SAVE_MODE)
		inode->i_mode = attr->mode & EVENTFS_MODE_MASK;
	else
		inode->i_mode = mode;

	if (attr->mode & EVENTFS_SAVE_UID)
		inode->i_uid = attr->uid;

	if (attr->mode & EVENTFS_SAVE_GID)
		inode->i_gid = attr->gid;
}

And that would solve both issues.

 # umount /sys/kernel/tracing
 # mount -o rw,seclabel,relatime,gid=1000 -t tracefs nodev /mnt

 # ls -ld /mnt/events/sched
drwxr-xr-x 28 root rostedt 0 Dec 21 13:12 /mnt/events/sched/

 # ls -ld /mnt/events/sched/sched_switch
drwxr-xr-x 2 root rostedt 0 Dec 21 13:12 /mnt/events/sched/sched_switch/

 # ls -l /mnt/events/sched/sched_switch
total 0
-rw-r----- 1 root rostedt 0 Dec 21 14:15 enable
-rw-r----- 1 root rostedt 0 Dec 21 14:15 filter
-r--r----- 1 root rostedt 0 Dec 21 14:15 format
-r--r----- 1 root rostedt 0 Dec 21 14:15 hist
-r--r----- 1 root rostedt 0 Dec 21 14:15 hist_debug
-r--r----- 1 root rostedt 0 Dec 21 14:15 id
--w------- 1 root rostedt 0 Dec 21 14:15 inject
-rw-r----- 1 root rostedt 0 Dec 21 14:15 trigger

And:

 # chgrp tracing /mnt/events/irq/softirq_entry
 # ls -l /mnt/events/irq/softirq_entry
total 0
-rw-r----- 1 root rostedt 0 Dec 21 14:16 enable
-rw-r----- 1 root rostedt 0 Dec 21 14:16 filter
-r--r----- 1 root rostedt 0 Dec 21 14:16 format
-r--r----- 1 root rostedt 0 Dec 21 14:16 hist
-r--r----- 1 root rostedt 0 Dec 21 14:16 hist_debug
-r--r----- 1 root rostedt 0 Dec 21 14:16 id
--w------- 1 root rostedt 0 Dec 21 14:16 inject
-rw-r----- 1 root rostedt 0 Dec 21 14:16 trigger


> 
> I still claim that the whole dynamic ftrace stuff was a huge mistake,
> and that the real solution should always have been to just use one
> single inode for every file (and use that 'attr' that you track and
> the '->getattr()' callback to make them all *look* different to
> users).

Files now do not even have meta-data, and that saved 2 megs per trace
instance. I only keep meta data for the directories. The files themselves
are created via callback functions.

The files in an event directory are all the same for each event. But the
events themselves need to maintain state per instance. A sched_switch event
may be enabled in one instance and not another. That's stored in the
trace_event_file structure that's allocated per event per instance.

The dentry and inodes are allocated when the files are referenced, via
callback functions that pass a descriptor that maps the director to the
trace_event_file to give the attributes needed.

I'm not sure a single inode would have prevented this bug, as it would
still need to have this type of accounting.

Anyway, what I have to do next after fixing this, is to add a selftest that
tests the chown/chgrp on eventfs directories and files.

Thanks for catching this!

-- Steve
Re: [GIT PULL] tracing: A few more fixes for 6.7
Posted by Linus Torvalds 1 year, 12 months ago
On Thu, 21 Dec 2023 at 11:27, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Luckily, that's easy to get to. All I need to do is:
>
> static void update_inode_attr(struct dentry *dentry, struct inode *inode,
>                               struct eventfs_attr *attr, umode_t mode)
> {
>         struct tracefs_fs_info *fsi = dentry->d_sb->s_fs_info;
>         struct tracefs_mount_opts *opts = &fsi->mount_opts;
>
>         /* Default the ownership to what it was mounted as */
>         inode->i_uid = opts->uid;
>         inode->i_gid = opts->gid;

I think you should add

>         inode->i_mode = mode;

to that "default setup", which not only makes things more consistent,
it also means that you can then remove it from here:

>         if (!attr) {
>                 inode->i_mode = mode;
>                 return;
>         }

.. and the 'else' side from here:

>         if (attr->mode & EVENTFS_SAVE_MODE)
>                 inode->i_mode = attr->mode & EVENTFS_MODE_MASK;
>         else
>                 inode->i_mode = mode;

and it all looks a lot more clear and obvious.

"Set things to default values, then if we have attr and the specific
fields are set in those attrs, update them".

Instead of having this odd "do one thing for git/uid, another for mode".

> > I still claim that the whole dynamic ftrace stuff was a huge mistake,
> > and that the real solution should always have been to just use one
> > single inode for every file (and use that 'attr' that you track and
> > the '->getattr()' callback to make them all *look* different to
> > users).
>
> Files now do not even have meta-data, and that saved 2 megs per trace
> instance. I only keep meta data for the directories. The files themselves
> are created via callback functions.

I bet that was basically *all* just the inodes.

The dentries take up very little space, and the fact that you didn't
keep the dentries around meant that you instead replaced them with
that 'struct eventfs_file' which probably takes up as much room as the
dentries ever did - and now when you use them, you obviously use
*more* memory since it duplicates the data in the dentries, including
the name etc.

So I bet you use *more* memory than if you just kept the dentry tree
around, and this dynamic creation has then caused a number of bugs and
a lot of extra complexity - things like having to re-implement your
own readdir() etc, much of which has been buggy.

And when you fix the resulting bugs, the end result is often
disgusting. I'm talking about things like commit ef36b4f92868
("eventfs: Remember what dentries were created on dir open"), which
does things like re-use file->private_data for two entirely different
things (is it a 'cursor' or a 'dlist'? Who can know? That thing makes
me gag).

Honestly, that was just one example of "that code does some truly ugly
things because the whole notion is mis-designed".

            Linus
Re: [GIT PULL] tracing: A few more fixes for 6.7
Posted by Steven Rostedt 1 year, 12 months ago
On Thu, 21 Dec 2023 12:01:45 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 21 Dec 2023 at 11:27, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Luckily, that's easy to get to. All I need to do is:

Note, this actually still has a broken corner case. And that would be:

[ No setting of ownership on mount ]

 # chown rostedt:rostedt instances
 # mkdir instance/foo

 # ls -l instances/foo
total 0
-r--r-----  1 rostedt rostedt 0 Dec 21 14:55 available_tracers
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 buffer_percent
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 buffer_size_kb
-r--r-----  1 rostedt rostedt 0 Dec 21 14:55 buffer_total_size_kb
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 current_tracer
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 error_log
drwxr-xr-x  2 root    root    0 Dec 21 14:55 events  <<<---- not updated, nor is the children.
--w-------  1 rostedt rostedt 0 Dec 21 14:55 free_buffer
drwxr-x---  2 rostedt rostedt 0 Dec 21 14:55 options
drwxr-x--- 10 rostedt rostedt 0 Dec 21 14:55 per_cpu
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_event
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_event_notrace_pid
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_event_pid
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_ftrace_filter
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_ftrace_notrace
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_ftrace_notrace_pid
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 set_ftrace_pid
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 snapshot
-r--r-----  1 rostedt rostedt 0 Dec 21 14:55 timestamp_mode
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 trace
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 trace_clock
--w--w----  1 rostedt rostedt 0 Dec 21 14:55 trace_marker
--w--w----  1 rostedt rostedt 0 Dec 21 14:55 trace_marker_raw
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 trace_options
-r--r-----  1 rostedt rostedt 0 Dec 21 14:55 trace_pipe
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 tracing_cpumask
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 tracing_max_latency
-rw-r-----  1 rostedt rostedt 0 Dec 21 14:55 tracing_on

The events directory did not take on the parent.

And even if it did, the files and directories created under it needs to
default to what the events directory was created with, not what the mount.

That's not too hard to fix either. We just need to save that information in
the eventfs_inode of the events directory, and find it when creating other
files and directories.

> >
> > static void update_inode_attr(struct dentry *dentry, struct inode *inode,
> >                               struct eventfs_attr *attr, umode_t mode)
> > {
> >         struct tracefs_fs_info *fsi = dentry->d_sb->s_fs_info;
> >         struct tracefs_mount_opts *opts = &fsi->mount_opts;
> >
> >         /* Default the ownership to what it was mounted as */
> >         inode->i_uid = opts->uid;
> >         inode->i_gid = opts->gid;  
> 
> I think you should add
> 
> >         inode->i_mode = mode;  
> 
> to that "default setup", which not only makes things more consistent,
> it also means that you can then remove it from here:
> 
> >         if (!attr) {
> >                 inode->i_mode = mode;
> >                 return;
> >         }  
> 
> .. and the 'else' side from here:
> 
> >         if (attr->mode & EVENTFS_SAVE_MODE)
> >                 inode->i_mode = attr->mode & EVENTFS_MODE_MASK;
> >         else
> >                 inode->i_mode = mode;  
> 
> and it all looks a lot more clear and obvious.

sure.

> 
> "Set things to default values, then if we have attr and the specific
> fields are set in those attrs, update them".
> 
> Instead of having this odd "do one thing for git/uid, another for mode".
> 
> > > I still claim that the whole dynamic ftrace stuff was a huge mistake,
> > > and that the real solution should always have been to just use one
> > > single inode for every file (and use that 'attr' that you track and
> > > the '->getattr()' callback to make them all *look* different to
> > > users).  
> >
> > Files now do not even have meta-data, and that saved 2 megs per trace
> > instance. I only keep meta data for the directories. The files themselves
> > are created via callback functions.  
> 
> I bet that was basically *all* just the inodes.
> 
> The dentries take up very little space, and the fact that you didn't
> keep the dentries around meant that you instead replaced them with
> that 'struct eventfs_file' which probably takes up as much room as the
> dentries ever did - and now when you use them, you obviously use
> *more* memory since it duplicates the data in the dentries, including
> the name etc.

In 6.7 I got rid of the eventfs_file and friends.

  5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")

> 
> So I bet you use *more* memory than if you just kept the dentry tree
> around, and this dynamic creation has then caused a number of bugs and
> a lot of extra complexity - things like having to re-implement your
> own readdir() etc, much of which has been buggy.

As I'm new to VFS I expected to have a lot of bugs, but those bugs are
settling down now.

> 
> And when you fix the resulting bugs, the end result is often
> disgusting. I'm talking about things like commit ef36b4f92868
> ("eventfs: Remember what dentries were created on dir open"), which
> does things like re-use file->private_data for two entirely different
> things (is it a 'cursor' or a 'dlist'? Who can know? That thing makes
> me gag).

As I stated back then, I could get rid of the re-use of private_data by
creating my own cursor, but then I would need to reimplement the code to
iterate. Ideally, the readdir code should not have touched the
private_data, as now the file system has no private data to use.

> 
> Honestly, that was just one example of "that code does some truly ugly
> things because the whole notion is mis-designed".

I would argue the implementation of readdir that couples itself with the
private_data of the descriptor it is implementing the walk from was a
mis-design. I had no way to pass information from the open to the release
function without re-implementing all of readdir.

-- Steve