From nobody Thu Dec 25 10:30:23 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03CDD20334
	for ; Wed, 17 Jan 2024 14:36:53 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1705502214; cv=none;
	b=FCnPZXKvLa1KoPAoXkBtdcYlScU5QXRZVMoXJxf85jg9OOM5u34Vdsvkhua1cG/+R0AtIuhV0CrIPZjGDheVBUAmBbB6oZVvUqFV+lOVgfdgg4hs4v7K2RNbQIbUE9tBK7obXKaegC0v+EHUWG4EbyVNc7d7L8svzEfo/YspCoE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1705502214; c=relaxed/simple;
	bh=+l/2J1wuNvwwox8H/9diOMBy2Vp1lI/iE/4NnUV0Ybw=;
	h=Received:Received:Message-ID:User-Agent:Date:From:To:Cc:Subject:
	 References:MIME-Version:Content-Type;
	b=FZNMTsUO2lvJrMLYd/9JDY4wuxXJ9/MB8Yfkb217f+wgowVEs/9Q4YMgoA4HKRXR3PnXU8g+/Bg5+m53nx4R3WC1/LgK3LoDJ71FZM/5A3id7/czR6lf+WgRrSWNxjeidDqWxT3yMsy0lCpOip8KsPlVNsxyXQzb89usGYjlxk0=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	arc=none smtp.client-ip=10.30.226.201
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9275BC43399;
	Wed, 17 Jan 2024 14:36:53 +0000 (UTC)
Received: from rostedt by gandalf with local (Exim 4.97)
	(envelope-from )
	id 1rQ73S-00000001XY4-2pb8;
	Wed, 17 Jan 2024 09:38:10 -0500
Message-ID: <20240117143810.531966508@goodmis.org>
User-Agent: quilt/0.67
Date: Wed, 17 Jan 2024 09:35:49 -0500
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Christian Brauner, Al Viro, Ajay Kaher, Linus Torvalds
Subject: [for-linus][PATCH 1/3] eventfs: Have the inodes all for files and directories all be the same
References: <20240117143548.595884070@goodmis.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: "Steven Rostedt (Google)"

The dentries and inodes are created in the readdir for the sole purpose of
getting a consistent inode number. Linus stated that is unnecessary, and
that all inodes can have the same inode number. For a virtual file system
they are pretty meaningless.

Instead use a single unique inode number for all files and one for all
directories.

Link: https://lore.kernel.org/all/20240116133753.2808d45e@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.412180363@goodmis.org

Cc: Masami Hiramatsu
Cc: Mark Rutland
Cc: Mathieu Desnoyers
Cc: Christian Brauner
Cc: Al Viro
Cc: Ajay Kaher
Suggested-by: Linus Torvalds
Signed-off-by: Steven Rostedt (Google)
Reviewed-by: Kees Cook
---
 fs/tracefs/event_inode.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index fdff53d5a1f8..5edf0b96758b 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -32,6 +32,10 @@
  */
 static DEFINE_MUTEX(eventfs_mutex);
 
+/* Choose something "unique" ;-) */
+#define EVENTFS_FILE_INODE_INO		0x12c4e37
+#define EVENTFS_DIR_INODE_INO		0x134b2f5
+
 /*
  * The eventfs_inode (ei) itself is protected by SRCU. It is released from
  * its parent's list and will have is_freed set (under eventfs_mutex).
@@ -352,6 +356,9 @@ static struct dentry *create_file(const char *name, umode_t mode,
 	inode->i_fop = fop;
 	inode->i_private = data;
 
+	/* All files will have the same inode number */
+	inode->i_ino = EVENTFS_FILE_INODE_INO;
+
 	ti = get_tracefs(inode);
 	ti->flags |= TRACEFS_EVENT_INODE;
 	d_instantiate(dentry, inode);
@@ -388,6 +395,9 @@ static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent
 	inode->i_op = &eventfs_root_dir_inode_operations;
 	inode->i_fop = &eventfs_file_operations;
 
+	/* All directories will have the same inode number */
+	inode->i_ino = EVENTFS_DIR_INODE_INO;
+
 	ti = get_tracefs(inode);
 	ti->flags |= TRACEFS_EVENT_INODE;
 
-- 
2.43.0

From nobody Thu Dec 25 10:30:23 2025
Message-ID: <20240117143810.698692638@goodmis.org>
Date: Wed, 17 Jan 2024 09:35:50 -0500
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Linus Torvalds, Christian Brauner, Al Viro, Ajay Kaher,
	kernel test robot
Subject: [for-linus][PATCH 2/3] eventfs: Do not create dentries nor inodes in iterate_shared
References: <20240117143548.595884070@goodmis.org>

From: "Steven Rostedt (Google)"

The original eventfs code added a wrapper around the dcache_readdir open
callback and created all the dentries and inodes at open, and incremented
their ref counts. A wrapper was added around the dcache_readdir release
function to decrement all the ref counts of those created inodes and
dentries.

But this proved to be buggy[1] for when a kprobe was created during a dir
read, it would create a dentry between the open and the release, and
because the release would decrement all ref counts of all files and
directories, that would include the kprobe directory that was not there to
have its ref count incremented in open. This would cause the ref count to
go to negative and later crash the kernel.

To solve this, the dentries and inodes that were created and had their ref
count upped in open needed to be saved. That list needed to be passed from
the open to the release, so that the release would only decrement the ref
counts of the entries that were incremented in the open.

Unfortunately, the dcache_readdir logic was already using the
file->private_data, which is the only field that can be used to pass
information from the open to the release. What eventfs did was create
another descriptor that had a void pointer to save the dcache_readdir
pointer, and it wrapped all the callbacks, so that it could save the list
of entries that had their ref counts incremented in the open, and pass it
to the release. The wrapped callbacks would just put back the
dcache_readdir pointer and call the functions it used so it could still
use its data[2].

But Linus had an issue with the "hijacking" of the file->private_data
(unfortunately this discussion was on a security list, so no public link).
We finally agreed to do everything within the iterate_shared callback and
leave dcache_readdir out of it[3]. All the information needed for
getdents() could be created then.

But this ended up being buggy too[4]. The iterate_shared callback was not
the right place to create the dentries and inodes. Even Christian Brauner
had issues with that[5].

One attempt was to go back to creating the inodes and dentries at the
open, create an array to store the information in the file->private_data,
and pass that information to the other callbacks.[6]

The difference between that and the original method is that it does not
use dcache_readdir. It also does not up the ref counts of the dentries and
pass them. Instead, it creates an array of a structure that saves the
dentry's name and inode number. That information is used in the
iterate_shared callback, and the array is freed in the dir release. The
dentries and inodes created in the open are not used for the
iterate_shared or release callbacks. Just their names and inode numbers.

Linus did not like that either[7] and just wanted to remove the dentries
being created in iterate_shared and use the hard-coded inode numbers.

[ All this while Linus enjoyed an unexpected vacation during the merge
  window due to lack of power. ]

[1] https://lore.kernel.org/linux-trace-kernel/20230919211804.230edf1e@gandalf.local.home/
[2] https://lore.kernel.org/linux-trace-kernel/20230922163446.1431d4fa@gandalf.local.home/
[3] https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org/
[4] https://lore.kernel.org/all/202401152142.bfc28861-oliver.sang@intel.com/
[5] https://lore.kernel.org/all/20240111-unzahl-gefegt-433acb8a841d@brauner/
[6] https://lore.kernel.org/all/20240116114711.7e8637be@gandalf.local.home/
[7] https://lore.kernel.org/all/20240116170154.5bf0a250@gandalf.local.home/

Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.573784051@goodmis.org

Cc: Masami Hiramatsu
Cc: Mark Rutland
Cc: Mathieu Desnoyers
Cc: Linus Torvalds
Cc: Christian Brauner
Cc: Al Viro
Cc: Ajay Kaher
Fixes: 493ec81a8fb8 ("eventfs: Stop using dcache_readdir() for getdents()")
Reported-by: kernel test robot
Closes: https://lore.kernel.org/oe-lkp/202401152142.bfc28861-oliver.sang@intel.com
Signed-off-by: Steven Rostedt (Google)
---
 fs/tracefs/event_inode.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 5edf0b96758b..10580d6b5012 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -727,8 +727,6 @@ static int eventfs_iterate(struct file *file, struct dir_context *ctx)
 	struct eventfs_inode *ei_child;
 	struct tracefs_inode *ti;
 	struct eventfs_inode *ei;
-	struct dentry *ei_dentry = NULL;
-	struct dentry *dentry;
 	const char *name;
 	umode_t mode;
 	int idx;
@@ -749,11 +747,11 @@ static int eventfs_iterate(struct file *file, struct dir_context *ctx)
 
 	mutex_lock(&eventfs_mutex);
 	ei = READ_ONCE(ti->private);
-	if (ei && !ei->is_freed)
-		ei_dentry = READ_ONCE(ei->dentry);
+	if (ei && ei->is_freed)
+		ei = NULL;
 	mutex_unlock(&eventfs_mutex);
 
-	if (!ei || !ei_dentry)
+	if (!ei)
 		goto out;
 
 	/*
@@ -780,11 +778,7 @@ static int eventfs_iterate(struct file *file, struct dir_context *ctx)
 		if (r <= 0)
 			continue;
 
-		dentry = create_file_dentry(ei, i, ei_dentry, name, mode, cdata, fops);
-		if (!dentry)
-			goto out;
-		ino = dentry->d_inode->i_ino;
-		dput(dentry);
+		ino = EVENTFS_FILE_INODE_INO;
 
 		if (!dir_emit(ctx, name, strlen(name), ino, DT_REG))
 			goto out;
@@ -808,11 +802,7 @@ static int eventfs_iterate(struct file *file, struct dir_context *ctx)
 
 		name = ei_child->name;
 
-		dentry = create_dir_dentry(ei, ei_child, ei_dentry);
-		if (!dentry)
-			goto out_dec;
-		ino = dentry->d_inode->i_ino;
-		dput(dentry);
+		ino = EVENTFS_DIR_INODE_INO;
 
 		if (!dir_emit(ctx, name, strlen(name), ino, DT_DIR))
 			goto out_dec;
-- 
2.43.0

From nobody Thu Dec 25 10:30:23 2025
Message-ID: <20240117143810.857039986@goodmis.org>
Date: Wed, 17 Jan 2024 09:35:51 -0500
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Erick Archer, "Gustavo A. R. Silva"
Subject: [for-linus][PATCH 3/3] eventfs: Use kcalloc() instead of kzalloc()
References: <20240117143548.595884070@goodmis.org>

From: Erick Archer

As noted in the "Deprecated Interfaces, Language Features, Attributes, and
Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead to
values wrapping around and a smaller allocation being made than the caller
was expecting. Using those allocations could lead to linear overflows of
heap memory and other misbehaviors.

So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.

[1] https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Link: https://lore.kernel.org/linux-trace-kernel/20240115181658.4562-1-erick.archer@gmx.com

Cc: Masami Hiramatsu
Cc: Mathieu Desnoyers
Cc: Mark Rutland
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Erick Archer
Reviewed-by: Gustavo A. R. Silva
Signed-off-by: Steven Rostedt (Google)
---
 fs/tracefs/event_inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 10580d6b5012..6795fda2af19 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -97,7 +97,7 @@ static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
 	/* Preallocate the children mode array if necessary */
 	if (!(dentry->d_inode->i_mode & S_IFDIR)) {
 		if (!ei->entry_attrs) {
-			ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * ei->nr_entries,
+			ei->entry_attrs = kcalloc(ei->nr_entries, sizeof(*ei->entry_attrs),
 						  GFP_NOFS);
 			if (!ei->entry_attrs) {
 				ret = -ENOMEM;
@@ -874,7 +874,7 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
 	}
 
 	if (size) {
-		ei->d_children = kzalloc(sizeof(*ei->d_children) * size, GFP_KERNEL);
+		ei->d_children = kcalloc(size, sizeof(*ei->d_children), GFP_KERNEL);
 		if (!ei->d_children) {
 			kfree_const(ei->name);
 			kfree(ei);
@@ -941,7 +941,7 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
 		goto fail;
 
 	if (size) {
-		ei->d_children = kzalloc(sizeof(*ei->d_children) * size, GFP_KERNEL);
+		ei->d_children = kcalloc(size, sizeof(*ei->d_children), GFP_KERNEL);
 		if (!ei->d_children)
 			goto fail;
 	}
-- 
2.43.0