[PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set

Steven Rostedt posted 18 patches 5 months ago
[PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set
Posted by Steven Rostedt 5 months ago
From: Josh Poimboeuf <jpoimboe@kernel.org>

get_perf_callchain() doesn't support cross-task unwinding for user space
stacks, have it return NULL if both the crosstask and user arguments are
set.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/events/callchain.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index b0f5bd228cd8..abf258913ab6 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
 	struct perf_callchain_entry_ctx ctx;
 	int rctx, start_entry_idx;
 
+	/* crosstask is not supported for user stacks */
+	if (crosstask && user)
+		return NULL;
+
 	entry = get_callchain_entry(&rctx);
 	if (!entry)
 		return NULL;
@@ -249,9 +253,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
 		}
 
 		if (regs) {
-			if (crosstask)
-				goto exit_put;
-
 			if (add_mark)
 				perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
 
@@ -261,7 +262,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
 		}
 	}
 
-exit_put:
 	put_callchain_entry(rctx);
 
 	return entry;
-- 
2.47.2
Re: [PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set
Posted by Andrii Nakryiko 5 months ago
On Fri, May 9, 2025 at 9:52 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: Josh Poimboeuf <jpoimboe@kernel.org>
>
> get_perf_callchain() doesn't support cross-task unwinding for user space
> stacks, have it return NULL if both the crosstask and user arguments are
> set.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  kernel/events/callchain.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> index b0f5bd228cd8..abf258913ab6 100644
> --- a/kernel/events/callchain.c
> +++ b/kernel/events/callchain.c
> @@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
>         struct perf_callchain_entry_ctx ctx;
>         int rctx, start_entry_idx;
>
> +       /* crosstask is not supported for user stacks */
> +       if (crosstask && user)
> +               return NULL;

I think get_perf_callchain() supports requesting both user and kernel
stack traces, and if it's crosstask, you can still get kernel (but not
user) stack, if I'm reading the code correctly.

So by just returning NULL early you will change this behavior, no?

> +
>         entry = get_callchain_entry(&rctx);
>         if (!entry)
>                 return NULL;
> @@ -249,9 +253,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
>                 }
>
>                 if (regs) {
> -                       if (crosstask)
> -                               goto exit_put;
> -
>                         if (add_mark)
>                                 perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
>
> @@ -261,7 +262,6 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
>                 }
>         }
>
> -exit_put:
>         put_callchain_entry(rctx);
>
>         return entry;
> --
> 2.47.2
>
>
>
Re: [PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set
Posted by Josh Poimboeuf 5 months ago
On Fri, May 09, 2025 at 02:53:38PM -0700, Andrii Nakryiko wrote:
> On Fri, May 9, 2025 at 9:52 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > From: Josh Poimboeuf <jpoimboe@kernel.org>
> >
> > get_perf_callchain() doesn't support cross-task unwinding for user space
> > stacks, have it return NULL if both the crosstask and user arguments are
> > set.
> >
> > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
> >  kernel/events/callchain.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> > index b0f5bd228cd8..abf258913ab6 100644
> > --- a/kernel/events/callchain.c
> > +++ b/kernel/events/callchain.c
> > @@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
> >         struct perf_callchain_entry_ctx ctx;
> >         int rctx, start_entry_idx;
> >
> > +       /* crosstask is not supported for user stacks */
> > +       if (crosstask && user)
> > +               return NULL;
> 
> I think get_perf_callchain() supports requesting both user and kernel
> stack traces, and if it's crosstask, you can still get kernel (but not
> user) stack, if I'm reading the code correctly.
> 
> So by just returning NULL early you will change this behavior, no?

Yeah, that does seem like a bug.

Though crosstask in general is dubious, even for kernel stacks.

If the task is running while you're unwinding it, hilarity ensues.
There are guardrails in place, so it should be safe, it may just produce
nonsense.  But maybe the callers don't need perfection.

But also, it would seem to be a bad idea to allow one task to spy on
what another task's kernelspace is doing.  Does unpriv BPF allow that?

Though it seems even 'cat /proc/self/stack' is a privileged operation
these days, does unpriv BPF allow that as well?

-- 
Josh
Re: [PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set
Posted by Andrii Nakryiko 4 months, 3 weeks ago
On Sat, May 10, 2025 at 10:59 AM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> On Fri, May 09, 2025 at 02:53:38PM -0700, Andrii Nakryiko wrote:
> > On Fri, May 9, 2025 at 9:52 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > From: Josh Poimboeuf <jpoimboe@kernel.org>
> > >
> > > get_perf_callchain() doesn't support cross-task unwinding for user space
> > > stacks, have it return NULL if both the crosstask and user arguments are
> > > set.
> > >
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > > ---
> > >  kernel/events/callchain.c | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> > > index b0f5bd228cd8..abf258913ab6 100644
> > > --- a/kernel/events/callchain.c
> > > +++ b/kernel/events/callchain.c
> > > @@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
> > >         struct perf_callchain_entry_ctx ctx;
> > >         int rctx, start_entry_idx;
> > >
> > > +       /* crosstask is not supported for user stacks */
> > > +       if (crosstask && user)
> > > +               return NULL;
> >
> > I think get_perf_callchain() supports requesting both user and kernel
> > stack traces, and if it's crosstask, you can still get kernel (but not
> > user) stack, if I'm reading the code correctly.
> >
> > So by just returning NULL early you will change this behavior, no?
>
> Yeah, that does seem like a bug.
>
> Though crosstask in general is dubious, even for kernel stacks.
>
> If the task is running while you're unwinding it, hilarity ensues.
> There are guardrails in place, so it should be safe, it may just produce
> nonsense.  But maybe the callers don't need perfection.
>
> But also, it would seem to be a bad idea to allow one task to spy on
> what another task's kernelspace is doing.  Does unpriv BPF allow that?

No, you need CAP_PERFMON to be able to capture stack traces, so we are
fine from that POV.

But also note that BPF itself doesn't allow requesting both user and
kernel stack traces in one go. So the issue I pointed out is not
really related to how BPF is using get_perf_callchain().

>
> Though it seems even 'cat /proc/self/stack' is a privileged operation
> these days, does unpriv BPF allow that as well?
>
> --
> Josh
Re: [PATCH v8 15/18] perf: Have get_perf_callchain() return NULL if crosstask and user are set
Posted by Steven Rostedt 5 months ago
On Fri, 9 May 2025 14:53:38 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> > @@ -224,6 +224,10 @@ get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
> >         struct perf_callchain_entry_ctx ctx;
> >         int rctx, start_entry_idx;
> >
> > +       /* crosstask is not supported for user stacks */
> > +       if (crosstask && user)
> > +               return NULL;  
> 
> I think get_perf_callchain() supports requesting both user and kernel
> stack traces, and if it's crosstask, you can still get kernel (but not
> user) stack, if I'm reading the code correctly.
> 
> So by just returning NULL early you will change this behavior, no?

Basically you are saying that one could ask for a kernel/user stack trace
with crosstask enabled and still just get the kernel trace?

If this is the case, then I think it may be best to remove patches 15-18
from this series and work on them in the "perf specific" series, as this
doesn't have anything to do with the unwind infrastructure itself.

Actually, patch 14 doesn't either, so I may move that one too (and keep the
acks to it).

Thanks,

-- Steve


> 
> > +
> >         entry = get_callchain_entry(&rctx);
> >         if (!entry)
> >                 return NULL;