[PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted

Ian Rogers posted 7 patches 3 months, 1 week ago
Posted by Ian Rogers 3 months, 1 week ago
Reading through the evsel->evlist may seg fault if a sample arrives
when the evlist is being deleted. Detect this case and ignore samples
arriving when the evlist is being deleted.

Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/intel-tpebs.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
index 4ad4bc118ea5..3b92ebf5c112 100644
--- a/tools/perf/util/intel-tpebs.c
+++ b/tools/perf/util/intel-tpebs.c
@@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
 
 static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
 {
-	pid_t workload_pid = t->evsel->evlist->workload.pid;
-	pid_t sample_pid = sample->pid;
+	pid_t workload_pid, sample_pid = sample->pid;
 
+	/*
+	 * During evlist__purge the evlist will be removed prior to the
+	 * evsel__exit calling evsel__tpebs_close and taking the
+	 * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
+	 */
+	if (t->evsel->evlist == NULL)
+		return true;
+
+	workload_pid = t->evsel->evlist->workload.pid;
 	if (workload_pid < 0 || workload_pid == sample_pid)
 		return false;
 
-- 
2.49.0.1238.gf8c92423fb-goog
Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
Posted by Namhyung Kim 3 months, 1 week ago
Hi Ian,

On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> Reading through the evsel->evlist may seg fault if a sample arrives
> when the evlist is being deleted. Detect this case and ignore samples
> arriving when the evlist is being deleted.
> 
> Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/intel-tpebs.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> index 4ad4bc118ea5..3b92ebf5c112 100644
> --- a/tools/perf/util/intel-tpebs.c
> +++ b/tools/perf/util/intel-tpebs.c
> @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
>  
>  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
>  {
> -	pid_t workload_pid = t->evsel->evlist->workload.pid;
> -	pid_t sample_pid = sample->pid;
> +	pid_t workload_pid, sample_pid = sample->pid;
>  
> +	/*
> +	 * During evlist__purge the evlist will be removed prior to the
> +	 * evsel__exit calling evsel__tpebs_close and taking the
> +	 * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> +	 */
> +	if (t->evsel->evlist == NULL)
> +		return true;
> +
> +	workload_pid = t->evsel->evlist->workload.pid;

I'm curious if there's a chance of TOCTOU race.  It'd certainly help
the segfault but would this code prevent it completely?

Thanks,
Namhyung


>  	if (workload_pid < 0 || workload_pid == sample_pid)
>  		return false;
>  
> -- 
> 2.49.0.1238.gf8c92423fb-goog
>
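
To make the TOCTOU question concrete: the patched function reads t->evsel->evlist twice, once for the NULL check and once for the dereference, so the pointer can change between the two reads. Below is a minimal sketch that loads the pointer once into a local. The struct and function names (demo_evlist, demo_evsel, demo_should_ignore_sample) are hypothetical stand-ins, not the real perf definitions, and the child-pid handling of the real function is left out. This only removes the double read; it would not help if the evlist memory were freed after the load.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the perf structures; not the real definitions. */
struct demo_evlist {
	pid_t workload_pid;
};

struct demo_evsel {
	/* Assumed to be set to NULL when the evsel is removed from its evlist. */
	_Atomic(struct demo_evlist *) evlist;
};

/*
 * Load the evlist pointer exactly once, then test and dereference the
 * local copy, so a concurrent store of NULL cannot land between the
 * check and the use. Simplification: every sample that is not from the
 * workload pid is ignored.
 */
static bool demo_should_ignore_sample(struct demo_evsel *evsel, pid_t sample_pid)
{
	struct demo_evlist *evlist =
		atomic_load_explicit(&evsel->evlist, memory_order_acquire);

	if (evlist == NULL)
		return true;

	if (evlist->workload_pid < 0 || evlist->workload_pid == sample_pid)
		return false;

	return true;
}
```
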
Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
Posted by Ian Rogers 3 months, 1 week ago
On Wed, May 28, 2025 at 10:53 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> > Reading through the evsel->evlist may seg fault if a sample arrives
> > when the evlist is being deleted. Detect this case and ignore samples
> > arriving when the evlist is being deleted.
> >
> > Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/intel-tpebs.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> > index 4ad4bc118ea5..3b92ebf5c112 100644
> > --- a/tools/perf/util/intel-tpebs.c
> > +++ b/tools/perf/util/intel-tpebs.c
> > @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
> >
> >  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
> >  {
> > -     pid_t workload_pid = t->evsel->evlist->workload.pid;
> > -     pid_t sample_pid = sample->pid;
> > +     pid_t workload_pid, sample_pid = sample->pid;
> >
> > +     /*
> > +      * During evlist__purge the evlist will be removed prior to the
> > +      * evsel__exit calling evsel__tpebs_close and taking the
> > +      * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> > +      */
> > +     if (t->evsel->evlist == NULL)
> > +             return true;
> > +
> > +     workload_pid = t->evsel->evlist->workload.pid;
>
> I'm curious if there's a chance of TOCTOU race.  It'd certainly help
> the segfault but would this code prevent it completely?

Good point. I think the race window is already small, as it doesn't
trigger without sanitizers for me.
Thinking about the evlist problem: when a destructor (evlist__delete)
runs, it is generally assumed that the code is single threaded, and in
C++ clang's -Wthread-safety will ignore destructors for this reason
(annoying imo as it hides bugs). I don't see a good way to solve that
for the evlist and evsel for the TPEBS case without using reference
counting. Adding reference counts to evlist and evsel would be doable
as we could use reference count checking, but it would be a large and
invasive change. Wdyt?

Thanks,
Ian

> Thanks,
> Namhyung
>
>
> >       if (workload_pid < 0 || workload_pid == sample_pid)
> >               return false;
> >
> > --
> > 2.49.0.1238.gf8c92423fb-goog
> >
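
For illustration, the reference counting idea can be sketched as follows: a reader takes its own reference before using the object, so the owner's put cannot free it underneath the reader. This is a generic C11 sketch with hypothetical names (demo_obj, demo_obj__new, demo_obj__get, demo_obj__put); it is not perf's evlist, its refcount_t, or its reference count checking.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not perf's evlist or refcount_t. */
struct demo_obj {
	atomic_int refcnt;
	int payload;
};

static struct demo_obj *demo_obj__new(int payload)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (obj == NULL)
		return NULL;
	atomic_init(&obj->refcnt, 1); /* the creator holds the first reference */
	obj->payload = payload;
	return obj;
}

/* Take an extra reference; the caller must already hold a valid one. */
static struct demo_obj *demo_obj__get(struct demo_obj *obj)
{
	atomic_fetch_add_explicit(&obj->refcnt, 1, memory_order_relaxed);
	return obj;
}

/* Drop a reference; returns 1 if this call freed the object, else 0. */
static int demo_obj__put(struct demo_obj *obj)
{
	if (atomic_fetch_sub_explicit(&obj->refcnt, 1, memory_order_acq_rel) != 1)
		return 0;
	free(obj);
	return 1;
}
```

With this shape, the deleting side calls demo_obj__put and the object stays alive until the last reader has also dropped its reference.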
Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
Posted by Namhyung Kim 3 months, 1 week ago
On Wed, May 28, 2025 at 11:02:44AM -0700, Ian Rogers wrote:
> On Wed, May 28, 2025 at 10:53 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> > > Reading through the evsel->evlist may seg fault if a sample arrives
> > > when the evlist is being deleted. Detect this case and ignore samples
> > > arriving when the evlist is being deleted.
> > >
> > > Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/intel-tpebs.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> > > index 4ad4bc118ea5..3b92ebf5c112 100644
> > > --- a/tools/perf/util/intel-tpebs.c
> > > +++ b/tools/perf/util/intel-tpebs.c
> > > @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
> > >
> > >  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
> > >  {
> > > -     pid_t workload_pid = t->evsel->evlist->workload.pid;
> > > -     pid_t sample_pid = sample->pid;
> > > +     pid_t workload_pid, sample_pid = sample->pid;
> > >
> > > +     /*
> > > +      * During evlist__purge the evlist will be removed prior to the
> > > +      * evsel__exit calling evsel__tpebs_close and taking the
> > > +      * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> > > +      */
> > > +     if (t->evsel->evlist == NULL)
> > > +             return true;
> > > +
> > > +     workload_pid = t->evsel->evlist->workload.pid;
> >
> > I'm curious if there's a chance of TOCTOU race.  It'd certainly help
> > the segfault but would this code prevent it completely?
> 
> Good point. I think the race is already small as it doesn't happen
> without sanitizers for me.
> Thinking about the evlist problem. When a destructor (evlist__delete)
> it is generally assumed the code is being single threaded and in C++
> clang's -Wthread-safety will ignore destructors for this reason
> (annoying imo as it hides bugs). I don't see a good way to solve that
> for the evlist and evsel for the TPEBS case without using reference
> counting. Adding reference counts to evlist and evsel would be do-able
> as we could use reference count checking, but it would be a large and
> invasive change. Wdyt?

Would it be possible to kill the TPEBS thread before deleting evlist?

Thanks,
Namhyung

Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
Posted by Ian Rogers 3 months, 1 week ago
On Wed, May 28, 2025 at 1:13 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, May 28, 2025 at 11:02:44AM -0700, Ian Rogers wrote:
> > On Wed, May 28, 2025 at 10:53 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> > > > Reading through the evsel->evlist may seg fault if a sample arrives
> > > > when the evlist is being deleted. Detect this case and ignore samples
> > > > arriving when the evlist is being deleted.
> > > >
> > > > Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > ---
> > > >  tools/perf/util/intel-tpebs.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> > > > index 4ad4bc118ea5..3b92ebf5c112 100644
> > > > --- a/tools/perf/util/intel-tpebs.c
> > > > +++ b/tools/perf/util/intel-tpebs.c
> > > > @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
> > > >
> > > >  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
> > > >  {
> > > > -     pid_t workload_pid = t->evsel->evlist->workload.pid;
> > > > -     pid_t sample_pid = sample->pid;
> > > > +     pid_t workload_pid, sample_pid = sample->pid;
> > > >
> > > > +     /*
> > > > +      * During evlist__purge the evlist will be removed prior to the
> > > > +      * evsel__exit calling evsel__tpebs_close and taking the
> > > > +      * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> > > > +      */
> > > > +     if (t->evsel->evlist == NULL)
> > > > +             return true;
> > > > +
> > > > +     workload_pid = t->evsel->evlist->workload.pid;
> > >
> > > I'm curious if there's a chance of TOCTOU race.  It'd certainly help
> > > the segfault but would this code prevent it completely?
> >
> > Good point. I think the race is already small as it doesn't happen
> > without sanitizers for me.
> > Thinking about the evlist problem. When a destructor (evlist__delete)
> > it is generally assumed the code is being single threaded and in C++
> > clang's -Wthread-safety will ignore destructors for this reason
> > (annoying imo as it hides bugs). I don't see a good way to solve that
> > for the evlist and evsel for the TPEBS case without using reference
> > counting. Adding reference counts to evlist and evsel would be do-able
> > as we could use reference count checking, but it would be a large and
> > invasive change. Wdyt?
>
> Would it be possible to kill the TPEBS thread before deleting evlist?

The TPEBS thread and other data structures are global and not tied to
the evlist, so there can be, and are, multiple evlists at play. When
using TPEBS there is the evlist for perf stat and also the evlist for
the samples. There's sense in having the evlist own the TPEBS data
structures, and there's also sense in things being global. I think if
I'd done it I'd have gone with TPEBS within the evlist, but I suspect
that in the original changes there was a worry about adding cost on
non-x86 builds.

Thanks,
Ian

> Thanks,
> Namhyung
>
Re: [PATCH v2 4/7] perf intel-tpebs: Avoid race when evlist is being deleted
Posted by Namhyung Kim 3 months, 1 week ago
On Wed, May 28, 2025 at 01:44:36PM -0700, Ian Rogers wrote:
> On Wed, May 28, 2025 at 1:13 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, May 28, 2025 at 11:02:44AM -0700, Ian Rogers wrote:
> > > On Wed, May 28, 2025 at 10:53 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hi Ian,
> > > >
> > > > On Tue, May 27, 2025 at 08:26:34PM -0700, Ian Rogers wrote:
> > > > > Reading through the evsel->evlist may seg fault if a sample arrives
> > > > > when the evlist is being deleted. Detect this case and ignore samples
> > > > > arriving when the evlist is being deleted.
> > > > >
> > > > > Fixes: bcfab08db7fb ("perf intel-tpebs: Filter non-workload samples")
> > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > ---
> > > > >  tools/perf/util/intel-tpebs.c | 12 ++++++++++--
> > > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> > > > > index 4ad4bc118ea5..3b92ebf5c112 100644
> > > > > --- a/tools/perf/util/intel-tpebs.c
> > > > > +++ b/tools/perf/util/intel-tpebs.c
> > > > > @@ -162,9 +162,17 @@ static bool is_child_pid(pid_t parent, pid_t child)
> > > > >
> > > > >  static bool should_ignore_sample(const struct perf_sample *sample, const struct tpebs_retire_lat *t)
> > > > >  {
> > > > > -     pid_t workload_pid = t->evsel->evlist->workload.pid;
> > > > > -     pid_t sample_pid = sample->pid;
> > > > > +     pid_t workload_pid, sample_pid = sample->pid;
> > > > >
> > > > > +     /*
> > > > > +      * During evlist__purge the evlist will be removed prior to the
> > > > > +      * evsel__exit calling evsel__tpebs_close and taking the
> > > > > +      * tpebs_mtx. Avoid a segfault by ignoring samples in this case.
> > > > > +      */
> > > > > +     if (t->evsel->evlist == NULL)
> > > > > +             return true;
> > > > > +
> > > > > +     workload_pid = t->evsel->evlist->workload.pid;
> > > >
> > > > I'm curious if there's a chance of TOCTOU race.  It'd certainly help
> > > > the segfault but would this code prevent it completely?
> > >
> > > Good point. I think the race is already small as it doesn't happen
> > > without sanitizers for me.
> > > Thinking about the evlist problem. When a destructor (evlist__delete)
> > > it is generally assumed the code is being single threaded and in C++
> > > clang's -Wthread-safety will ignore destructors for this reason
> > > (annoying imo as it hides bugs). I don't see a good way to solve that
> > > for the evlist and evsel for the TPEBS case without using reference
> > > counting. Adding reference counts to evlist and evsel would be do-able
> > > as we could use reference count checking, but it would be a large and
> > > invasive change. Wdyt?
> >
> > Would it be possible to kill the TPEBS thread before deleting evlist?
> 
> The TPEBS thread and other data structures are global and not tied to
> the evlist, so there can and are multiple evlists at play. When using
> TPEBS there is the evlist for perf stat, there is also the evlist for
> the samples. There's sense in having the evlist own the TPEBS data
> structures, there's also sense in things being global. I think if I'd
> done it I'd have gone with TPEBS within the evlist, but I suspect in
> the original changes there was a worry about adding cost on non-x86
> builds.

Ok, I thought the evlist is deleted quite late in the execution, so the
change might be easy to make.  If not, let's see how the current fix
works before going further. :)

Thanks,
Namhyung