[PATCH] kcov: don't lose track of remote references during softirqs

Aleksandr Nogikh posted 1 patch 1 year, 8 months ago
There is a newer version of this series
Posted by Aleksandr Nogikh 1 year, 8 months ago
In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
metadata of the current task into a per-CPU variable. However, the
kcov_mode_enabled(mode) check is not sufficient in the case of remote
KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
for remote KCOV objects.

If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
happens to get interrupted and kcov_remote_start() is called, it
ultimately leads to kcov_remote_stop() NOT restoring the original
KCOV reference. So when the task exits, all registered remote KCOV
handles remain active forever.

Fix it by introducing a special kcov_mode that is assigned to the
task that owns a KCOV remote object. It makes kcov_mode_enabled()
return true and yet does not trigger coverage collection in
__sanitizer_cov_trace_pc() and write_comp_data().

Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
---
 include/linux/kcov.h | 2 ++
 kernel/kcov.c        | 1 +
 2 files changed, 3 insertions(+)

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index b851ba415e03..3b479a3d235a 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -21,6 +21,8 @@ enum kcov_mode {
 	KCOV_MODE_TRACE_PC = 2,
 	/* Collecting comparison operands mode. */
 	KCOV_MODE_TRACE_CMP = 3,
+	/* The process owns a KCOV remote reference. */
+	KCOV_MODE_REMOTE = 4,
 };
 
 #define KCOV_IN_CTXSW	(1 << 30)
diff --git a/kernel/kcov.c b/kernel/kcov.c
index c3124f6d5536..5371d3f7b5c3 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 			return -EINVAL;
 		kcov->mode = mode;
 		t->kcov = kcov;
+		WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
 		kcov->t = t;
 		kcov->remote = true;
 		kcov->remote_size = remote_arg->area_size;
-- 
2.45.2.505.gda0bf45e8d-goog
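
For readers following the thread, the save/restore problem described in the
commit message can be sketched as a small userspace model. This is not the
kernel code: struct model_task, struct model_percpu and the model_*()
functions are hypothetical stand-ins, the values of KCOV_MODE_DISABLED and
KCOV_MODE_INIT are taken from the kernel header rather than from this diff,
and the exact-match check in model_traces_pc() is an assumption about how the
coverage callbacks test the mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode values; KCOV_MODE_REMOTE is the value added by this patch. */
enum kcov_mode {
	KCOV_MODE_DISABLED = 0,
	KCOV_MODE_INIT = 1,
	KCOV_MODE_TRACE_PC = 2,
	KCOV_MODE_TRACE_CMP = 3,
	KCOV_MODE_REMOTE = 4,
};
#define KCOV_IN_CTXSW	(1 << 30)

/* Hypothetical stand-ins for the task and per-CPU fields involved. */
struct model_task {
	unsigned int kcov_mode;
	const void *kcov;
};

struct model_percpu {
	unsigned int saved_mode;
	const void *saved_kcov;
};

static bool kcov_mode_enabled(unsigned int mode)
{
	return (mode & ~KCOV_IN_CTXSW) != KCOV_MODE_DISABLED;
}

/* Assumption: coverage callbacks fire only on an exact mode match. */
static bool model_traces_pc(unsigned int mode)
{
	return mode == KCOV_MODE_TRACE_PC;
}

/* Softirq entry: save the task's KCOV state only if its mode looks enabled. */
static void model_remote_start(struct model_task *t, struct model_percpu *d,
			       const void *remote)
{
	assert(d->saved_kcov == NULL);	/* no nested saves in this model */
	if (kcov_mode_enabled(t->kcov_mode)) {
		d->saved_mode = t->kcov_mode;
		d->saved_kcov = t->kcov;
	}
	t->kcov = remote;
	t->kcov_mode = KCOV_MODE_TRACE_PC;
}

/* Softirq exit: clear the task state, then restore whatever was saved. */
static void model_remote_stop(struct model_task *t, struct model_percpu *d)
{
	t->kcov = NULL;
	t->kcov_mode = KCOV_MODE_DISABLED;
	if (d->saved_kcov) {
		t->kcov = d->saved_kcov;
		t->kcov_mode = d->saved_mode;
		d->saved_kcov = NULL;
		d->saved_mode = KCOV_MODE_DISABLED;
	}
}

/* Returns true if the owner still holds its reference after one softirq. */
static bool model_reference_survives(unsigned int owner_mode)
{
	static const char owner_obj = 0, softirq_obj = 0;
	struct model_task t = { owner_mode, &owner_obj };
	struct model_percpu d = { KCOV_MODE_DISABLED, NULL };

	model_remote_start(&t, &d, &softirq_obj);
	model_remote_stop(&t, &d);
	return t.kcov == &owner_obj;
}
```

In this model, model_reference_survives(KCOV_MODE_DISABLED) reproduces the
lost reference, model_reference_survives(KCOV_MODE_REMOTE) keeps it, and
model_traces_pc(KCOV_MODE_REMOTE) stays false, which matches the intent stated
in the commit message.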
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Andrey Konovalov 1 year, 8 months ago
On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <nogikh@google.com> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> ---
>  include/linux/kcov.h | 2 ++
>  kernel/kcov.c        | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
>         KCOV_MODE_TRACE_PC = 2,
>         /* Collecting comparison operands mode. */
>         KCOV_MODE_TRACE_CMP = 3,
> +       /* The process owns a KCOV remote reference. */
> +       KCOV_MODE_REMOTE = 4,
>  };
>
>  #define KCOV_IN_CTXSW  (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
>                         return -EINVAL;
>                 kcov->mode = mode;
>                 t->kcov = kcov;
> +               WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);

Looking at this again, I don't think we need this WRITE_ONCE here, as
we have interrupts disabled. But if we do, perhaps it makes sense to
add a comment explaining why.

>                 kcov->t = t;
>                 kcov->remote = true;
>                 kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Aleksandr Nogikh 1 year, 8 months ago
On Fri, Jun 14, 2024 at 1:02 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <nogikh@google.com> wrote:
> >
> > In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> > metadata of the current task into a per-CPU variable. However, the
> > kcov_mode_enabled(mode) check is not sufficient in the case of remote
> > KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> > for remote KCOV objects.
> >
> > If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> > happens to get interrupted and kcov_remote_start() is called, it
> > ultimately leads to kcov_remote_stop() NOT restoring the original
> > KCOV reference. So when the task exits, all registered remote KCOV
> > handles remain active forever.
> >
> > Fix it by introducing a special kcov_mode that is assigned to the
> > task that owns a KCOV remote object. It makes kcov_mode_enabled()
> > return true and yet does not trigger coverage collection in
> > __sanitizer_cov_trace_pc() and write_comp_data().
> >
> > Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
> > Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> > ---
> >  include/linux/kcov.h | 2 ++
> >  kernel/kcov.c        | 1 +
> >  2 files changed, 3 insertions(+)
> >
> > diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> > index b851ba415e03..3b479a3d235a 100644
> > --- a/include/linux/kcov.h
> > +++ b/include/linux/kcov.h
> > @@ -21,6 +21,8 @@ enum kcov_mode {
> >         KCOV_MODE_TRACE_PC = 2,
> >         /* Collecting comparison operands mode. */
> >         KCOV_MODE_TRACE_CMP = 3,
> > +       /* The process owns a KCOV remote reference. */
> > +       KCOV_MODE_REMOTE = 4,
> >  };
> >
> >  #define KCOV_IN_CTXSW  (1 << 30)
> > diff --git a/kernel/kcov.c b/kernel/kcov.c
> > index c3124f6d5536..5371d3f7b5c3 100644
> > --- a/kernel/kcov.c
> > +++ b/kernel/kcov.c
> > @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
> >                         return -EINVAL;
> >                 kcov->mode = mode;
> >                 t->kcov = kcov;
> > +               WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
>
> Looking at this again, I don't think we need this WRITE_ONCE here, as
> we have interrupts disabled. But if we do, perhaps it makes sense to
> add a comment explaining why.

Thank you!
I've sent a v2:
https://lore.kernel.org/all/20240614171221.2837584-1-nogikh@google.com/

>
> >                 kcov->t = t;
> >                 kcov->remote = true;
> >                 kcov->remote_size = remote_arg->area_size;
> > --
> > 2.45.2.505.gda0bf45e8d-goog
> >
>
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Andrey Konovalov 1 year, 8 months ago
On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <nogikh@google.com> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> ---
>  include/linux/kcov.h | 2 ++
>  kernel/kcov.c        | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
>         KCOV_MODE_TRACE_PC = 2,
>         /* Collecting comparison operands mode. */
>         KCOV_MODE_TRACE_CMP = 3,
> +       /* The process owns a KCOV remote reference. */
> +       KCOV_MODE_REMOTE = 4,
>  };
>
>  #define KCOV_IN_CTXSW  (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
>                         return -EINVAL;
>                 kcov->mode = mode;
>                 t->kcov = kcov;
> +               WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
>                 kcov->t = t;
>                 kcov->remote = true;
>                 kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>

Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>

Thank you for fixing this!
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Andrew Morton 1 year, 8 months ago
On Tue, 11 Jun 2024 15:32:29 +0200 Aleksandr Nogikh <nogikh@google.com> wrote:

> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
> 
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
> 
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().

What are the userspace visible effects of this bug?  I *think* it's
just an efficiency thing, but how significant?  In other words, should
we backport this fix?
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Aleksandr Nogikh 1 year, 8 months ago
On Tue, Jun 11, 2024 at 8:51 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 11 Jun 2024 15:32:29 +0200 Aleksandr Nogikh <nogikh@google.com> wrote:
>
> > In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> > metadata of the current task into a per-CPU variable. However, the
> > kcov_mode_enabled(mode) check is not sufficient in the case of remote
> > KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> > for remote KCOV objects.
> >
> > If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> > happens to get interrupted and kcov_remote_start() is called, it
> > ultimately leads to kcov_remote_stop() NOT restoring the original
> > KCOV reference. So when the task exits, all registered remote KCOV
> > handles remain active forever.
> >
> > Fix it by introducing a special kcov_mode that is assigned to the
> > task that owns a KCOV remote object. It makes kcov_mode_enabled()
> > return true and yet does not trigger coverage collection in
> > __sanitizer_cov_trace_pc() and write_comp_data().
>
> What are the userspace visible effects of this bug?  I *think* it's
> just an efficiency thing, but how significant?  In other words, should
> we backport this fix?
>

The most uncomfortable effect (at least for syzkaller) is that the bug
prevents the reuse of the same /sys/kernel/debug/kcov descriptor. If
we obtain it in the parent process and then e.g. drop some
capabilities and continuously fork to execute individual programs, at
some point current->kcov of the forked process is lost,
kcov_task_exit() takes no action, and all KCOV_REMOTE_ENABLE ioctl
calls from subsequent forks fail.

And, yes, the efficiency is also affected if we keep on losing remote
kcov objects.
a) kcov_remote_map keeps on growing forever.
b) If I'm not mistaken, we're also not freeing the memory referenced
by kcov->area.

I think it would be nice to backport the fix to the stable trees.

-- 
Aleksandr
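
The fork scenario described above can be illustrated with a rough userspace
model. It is a sketch only: struct fork_kcov, struct fork_task and the
model_*() functions are hypothetical, the "owned" flag stands in for the
descriptor being bound to a task, and the failure is modeled as a plain -1
rather than any particular errno.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared KCOV descriptor. */
struct fork_kcov {
	bool owned;		/* true while some task is bound to it */
};

/* Hypothetical stand-in for the forked task's KCOV reference. */
struct fork_task {
	struct fork_kcov *kcov;
};

/* Models KCOV_REMOTE_ENABLE: fails while the descriptor is still bound. */
static int model_remote_enable(struct fork_kcov *k, struct fork_task *t)
{
	if (k->owned || t->kcov)
		return -1;
	k->owned = true;
	t->kcov = k;
	return 0;
}

/* Models task exit: without a reference there is nothing to release. */
static void model_task_exit(struct fork_task *t)
{
	if (!t->kcov)
		return;
	t->kcov->owned = false;
	t->kcov = NULL;
}

/*
 * Runs `forks` children against one descriptor and returns how many enable
 * calls failed. In iteration `lose_ref_at` the child's reference is dropped
 * before exit, imitating the softirq bug (pass -1 to never lose it).
 */
static int model_fork_loop(int forks, int lose_ref_at)
{
	struct fork_kcov k = { false };
	int failures = 0;

	assert(forks >= 0);
	for (int i = 0; i < forks; i++) {
		struct fork_task child = { NULL };

		if (model_remote_enable(&k, &child) != 0)
			failures++;
		if (i == lose_ref_at)
			child.kcov = NULL;	/* reference lost in softirq */
		model_task_exit(&child);
	}
	return failures;
}
```

With the reference never lost, every modeled fork succeeds. Once a single
child loses its reference, the descriptor stays bound and each later enable
call in the loop fails, which is the descriptor-reuse problem described in
this reply.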
Re: [PATCH] kcov: don't lose track of remote references during softirqs
Posted by Dmitry Vyukov 1 year, 8 months ago
On Tue, 11 Jun 2024 at 15:32, Aleksandr Nogikh <nogikh@google.com> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

> ---
>  include/linux/kcov.h | 2 ++
>  kernel/kcov.c        | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
>         KCOV_MODE_TRACE_PC = 2,
>         /* Collecting comparison operands mode. */
>         KCOV_MODE_TRACE_CMP = 3,
> +       /* The process owns a KCOV remote reference. */
> +       KCOV_MODE_REMOTE = 4,
>  };
>
>  #define KCOV_IN_CTXSW  (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
>                         return -EINVAL;
>                 kcov->mode = mode;
>                 t->kcov = kcov;
> +               WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
>                 kcov->t = t;
>                 kcov->remote = true;
>                 kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>