[RFC][PATCH] fanotify: register event pid for pidfd reporting

AnonymeMeow posted 1 patch 1 week, 1 day ago
fs/notify/fanotify/fanotify.c | 10 ++++++++++
1 file changed, 10 insertions(+)
[RFC][PATCH] fanotify: register event pid for pidfd reporting
Posted by AnonymeMeow 1 week, 1 day ago
pidfd_prepare() may still fail if userspace reads the event after the
task has been reaped and the pid has not been registered with pidfs.
Register the event pid with pidfs when creating the fanotify event if
pidfd reporting was requested, so pidfd_prepare() can later create a
pidfd for the reaped task.

Signed-off-by: AnonymeMeow <anonymemeow@gmail.com>
---

Hi, Christian,

I tested this patch set and found that fanotify still usually can not
report pidfds for reaped tasks, because the event pid may not have
been registered before it exits, and registering it after the task
has been reaped is not allowed, so fanotify still fails to report a
pidfd for the event.

This patch registers the event pid when the fanotify event is created.
With this change, the current LTP fanotify21 test fails as expected.

But I do not know whether it is worth doing so to improve fanotify's
ability to report pidfds for reaped tasks. And, what should fanotify
do if pidfs_register_pid() fails? Should the error be reported to
userspace through the pidfd info record, using the raw error when
FAN_REPORT_FD_ERROR is set and FAN_EPIDFD otherwise? But the event
struct in fanotify does not have spare space to store this error code.
Or should it just ignore the failure here and let pidfd_prepare()
fail later in copy_event_to_user()? I would appreciate your feedback.

With Best Regards,
AnonymeMeow

Link: https://lore.kernel.org/lkml/20260528-schmuckvoll-heilen-garen-be77b4208671@brauner/

---
 fs/notify/fanotify/fanotify.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 38290b9c07f7..24b2af397126 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -14,6 +14,7 @@
 #include <linux/sched/mm.h>
 #include <linux/statfs.h>
 #include <linux/stringhash.h>
+#include <linux/pidfs.h>
 
 #include "fanotify.h"
 
@@ -868,6 +869,15 @@ static struct fanotify_event *fanotify_alloc_event(
 	else
 		pid = get_pid(task_tgid(current));
 
+	if (FAN_GROUP_FLAG(group, FAN_REPORT_PIDFD)) {
+		int err = pidfs_register_pid(pid);
+		if (err) {
+			/*
+			 * What to do here? Pass this err to userspace via pidfd?
+			 */
+		}
+	}
+
 	/* Mix event info, FAN_ONDIR flag and pid into event merge key */
 	hash ^= hash_long((unsigned long)pid | ondir, FANOTIFY_EVENT_HASH_BITS);
 	fanotify_init_event(event, hash, mask);
-- 
2.54.0
Re: [RFC][PATCH] fanotify: register event pid for pidfd reporting
Posted by Jan Kara 6 days, 17 hours ago
On Sat 30-05-26 18:51:17, AnonymeMeow wrote:
> pidfd_prepare() may still fail if userspace reads the event after the
> task has been reaped and the pid has not been registered with pidfs.
> Register the event pid with pidfs when creating the fanotify event if
> pidfd reporting was requested, so pidfd_prepare() can later create a
> pidfd for the reaped task.
> 
> Signed-off-by: AnonymeMeow <anonymemeow@gmail.com>

So the two patches in this series so far look good to me. This patch looks
OK to me as well and since the overhead of allocation etc. is only once per
registered task, I think it's fine but let's wait a bit whether Christian
agrees this is what needs to be done.

								Honza

> ---
> 
> Hi, Christian,
> 
> I tested this patch set and found that fanotify still usually can not
> report pidfds for reaped tasks, because the event pid may not have
> been registered before it exits, and registering it after the task
> has been reaped is not allowed, so fanotify still fails to report a
> pidfd for the event.
> 
> This patch registers the event pid when the fanotify event is created.
> With this change, the current LTP fanotify21 test fails as expected.
> 
> But I do not know whether it is worth doing so to improve fanotify's
> ability to report pidfds for reaped tasks. And, what should fanotify
> do if pidfs_register_pid() fails? Should the error be reported to
> userspace through the pidfd info record, using the raw error when
> FAN_REPORT_FD_ERROR is set and FAN_EPIDFD otherwise? But the event
> struct in fanotify does not have spare space to store this error code.
> Or should it just ignore the failure here and let pidfd_prepare()
> fail later in copy_event_to_user()? I would appreciate your feedback.
> 
> With Best Regards,
> AnonymeMeow
> 
> Link: https://lore.kernel.org/lkml/20260528-schmuckvoll-heilen-garen-be77b4208671@brauner/
> 
> ---
>  fs/notify/fanotify/fanotify.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 38290b9c07f7..24b2af397126 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -14,6 +14,7 @@
>  #include <linux/sched/mm.h>
>  #include <linux/statfs.h>
>  #include <linux/stringhash.h>
> +#include <linux/pidfs.h>
>  
>  #include "fanotify.h"
>  
> @@ -868,6 +869,15 @@ static struct fanotify_event *fanotify_alloc_event(
>  	else
>  		pid = get_pid(task_tgid(current));
>  
> +	if (FAN_GROUP_FLAG(group, FAN_REPORT_PIDFD)) {
> +		int err = pidfs_register_pid(pid);
> +		if (err) {
> +			/*
> +			 * What to do here? Pass this err to userspace via pidfd?
> +			 */
> +		}
> +	}
> +
>  	/* Mix event info, FAN_ONDIR flag and pid into event merge key */
>  	hash ^= hash_long((unsigned long)pid | ondir, FANOTIFY_EVENT_HASH_BITS);
>  	fanotify_init_event(event, hash, mask);
> -- 
> 2.54.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
Re: [RFC][PATCH] fanotify: register event pid for pidfd reporting
Posted by Christian Brauner 5 days, 18 hours ago
On Mon, Jun 01, 2026 at 05:24:36PM +0200, Jan Kara wrote:
> On Sat 30-05-26 18:51:17, AnonymeMeow wrote:
> > pidfd_prepare() may still fail if userspace reads the event after the
> > task has been reaped and the pid has not been registered with pidfs.
> > Register the event pid with pidfs when creating the fanotify event if
> > pidfd reporting was requested, so pidfd_prepare() can later create a
> > pidfd for the reaped task.
> > 
> > Signed-off-by: AnonymeMeow <anonymemeow@gmail.com>
> 
> So the two patches in this series so far look good to me. This patch looks
> OK to me as well and since the overhead of allocation etc. is only once per

It's a kmem cache so should not matter at all...

> registered task, I think it's fine but let's wait a bit whether Christian
> agrees this is what needs to be done.

pidfs_register_pid() can only fail with ENOMEM or if the task is already
dead. The latter cannot happen because even for a non-empty thread group
where the thread-group leader already exited but subthreads are still
alive the thread-group leader cannot be reaped. So the only error here
is ENOMEM which means we should defo fail this, imho.

And yes, I think this is worth doing. Because you get a persistent
handle on the task that generated that event + all the exit information
even if that thask has been reaped in the meantime. I'm pretty sure if
you're using fanotify that's a rather huge win for apps that need it.