[PATCH 1/3] seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited

Andrei Vagin posted 3 patches 1 year, 8 months ago
There is a newer version of this series
[PATCH 1/3] seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
Posted by Andrei Vagin 1 year, 8 months ago
SECCOMP_IOCTL_NOTIF_RECV promptly returns when a seccomp filter becomes
unused, as a filter without users can't trigger any events.

Previously, event listeners had to rely on epoll to detect when all
processes had exited.

The change is based on the 'commit 99cdb8b9a573 ("seccomp: notify about
unused filter")' which implemented (E)POLLHUP notifications.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
---
 kernel/seccomp.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f70e031e06a8..35435e8f1035 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1466,7 +1466,7 @@ static int recv_wake_function(wait_queue_entry_t *wait, unsigned int mode, int s
 				  void *key)
 {
 	/* Avoid a wakeup if event not interesting for us. */
-	if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR)))
+	if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR | EPOLLHUP)))
 		return 0;
 	return autoremove_wake_function(wait, mode, sync, key);
 }
@@ -1476,6 +1476,9 @@ static int recv_wait_event(struct seccomp_filter *filter)
 	DEFINE_WAIT_FUNC(wait, recv_wake_function);
 	int ret;
 
+	if (refcount_read(&filter->users) == 0)
+		return 0;
+
 	if (atomic_dec_if_positive(&filter->notif->requests) >= 0)
 		return 0;
 
@@ -1484,6 +1487,8 @@ static int recv_wait_event(struct seccomp_filter *filter)
 
 		if (atomic_dec_if_positive(&filter->notif->requests) >= 0)
 			break;
+		if (refcount_read(&filter->users) == 0)
+			break;
 
 		if (ret)
 			return ret;
-- 
2.45.1.288.g0e0cd299f1-goog
Re: [PATCH 1/3] seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
Posted by Oleg Nesterov 1 year, 8 months ago
Hi Andrei,

the patch looks good to me even if I don't really understand what
SECCOMP_IOCTL_NOTIF_RECV does. But let me ask a stupid question,

On 05/23, Andrei Vagin wrote:
>
> The change is based on the 'commit 99cdb8b9a573 ("seccomp: notify about
> unused filter")' which implemented (E)POLLHUP notifications.

To me this patch fixes the commit above, because without this change

> @@ -1466,7 +1466,7 @@ static int recv_wake_function(wait_queue_entry_t *wait, unsigned int mode, int s
>  				  void *key)
>  {
>  	/* Avoid a wakeup if event not interesting for us. */
> -	if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR)))
> +	if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR | EPOLLHUP)))

__seccomp_filter_orphan() -> wake_up_poll(&orig->wqh, EPOLLHUP) won't
wakeup the task sleeping in recv_wait_event(), right ?

In any case, FWIW

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Re: [PATCH 1/3] seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
Posted by Andrei Vagin 1 year, 8 months ago
On Thu, May 23, 2024 at 2:00 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Hi Andrei,
>
> the patch looks good to me even if I don't really understand what
> SECCOMP_IOCTL_NOTIF_RECV does. But let me ask a stupid question,
>
> On 05/23, Andrei Vagin wrote:
> >
> > The change is based on the 'commit 99cdb8b9a573 ("seccomp: notify about
> > unused filter")' which implemented (E)POLLHUP notifications.
>
> To me this patch fixes the commit above, because without this change

It depends on how we look at it. I think the intention was to address
the epoll/poll/select syscalls to return (E)POLLHUP notifications when
filters have been orphaned. Plus, this code looked a bit different that
time and recv_wake_function used another notification mechanism.

>
> > @@ -1466,7 +1466,7 @@ static int recv_wake_function(wait_queue_entry_t *wait, unsigned int mode, int s
> >                                 void *key)
> >  {
> >       /* Avoid a wakeup if event not interesting for us. */
> > -     if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR)))
> > +     if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR | EPOLLHUP)))
>
> __seccomp_filter_orphan() -> wake_up_poll(&orig->wqh, EPOLLHUP) won't
> wakeup the task sleeping in recv_wait_event(), right ?
>
> In any case, FWIW
>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>

Thanks,
Andrei