[PATCH] perf trace: Fix IS_ERR() vs NULL check bug

wangguangju@hygon.cn posted 1 patch 1 month ago
tools/perf/builtin-trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by wangguangju@hygon.cn 1 month ago
From: wangguangju <wangguangju@hygon.cn>

The alloc_syscall_stats() function always returns an error pointer
(ERR_PTR) on failure.

So replace NULL check with IS_ERR() check after calling
delete_syscall_stats() function.

Signed-off-by: wangguangju <wangguangju@hygon.cn>
---
 tools/perf/builtin-trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 311d9da9896a..295b272c6c29 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1573,7 +1573,7 @@ static void delete_syscall_stats(struct hashmap *syscall_stats)
 	struct hashmap_entry *pos;
 	size_t bkt;
 
-	if (syscall_stats == NULL)
+	if (IS_ERR(syscall_stats))
 		return;
 
 	hashmap__for_each_entry(syscall_stats, pos, bkt)
-- 
2.43.0
Re: [PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by Namhyung Kim 1 month ago
On Thu, 26 Feb 2026 20:22:08 +0800, wangguangju@hygon.cn wrote:
> The alloc_syscall_stats() function always returns an error pointer
> (ERR_PTR) on failure.
> 
> So replace NULL check with IS_ERR() check after calling
> delete_syscall_stats() function.
> 
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung
Re: [PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by Ian Rogers 1 month ago
On Fri, Feb 27, 2026 at 1:32 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, 26 Feb 2026 20:22:08 +0800, wangguangju@hygon.cn wrote:
> > The alloc_syscall_stats() function always returns an error pointer
> > (ERR_PTR) on failure.
> >
> > So replace NULL check with IS_ERR() check after calling
> > delete_syscall_stats() function.
> >
> >
> > [...]
> Applied to perf-tools-next, thanks!

This broke the "perf trace summary" test for me (run `perf test -v`).
The failing command under gdb shows:
```
$ gdb --args perf trace -s -- true
...
(gdb) r
...
Program received signal SIGSEGV, Segmentation fault.
0x0000555555672210 in delete_syscall_stats (syscall_stats=0x0) at
builtin-trace.c:1579
1579            hashmap__for_each_entry(syscall_stats, pos, bkt)
(gdb) l
1574            size_t bkt;
1575
1576            if (IS_ERR(syscall_stats))
1577                    return;
1578
1579            hashmap__for_each_entry(syscall_stats, pos, bkt)
1580                    zfree(&pos->pvalue);
1581            hashmap__free(syscall_stats);
1582    }
```

As NULL is false for IS_ERR I think the correct change is:
```
if (syscall_stats == NULL || IS_ERR(syscall_stats))
```

We seem to be routinely breaking tests, which may indicate a lot of
noise in the test output. I see many failures in the type profiling
test due to the typedef vs struct issue.

Thanks,
Ian

> Best regards,
> Namhyung
>
>
Re: [PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by Howard Chu 1 month ago
Hi Ian,

On Fri, Feb 27, 2026 at 10:40 PM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, Feb 27, 2026 at 1:32 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Thu, 26 Feb 2026 20:22:08 +0800, wangguangju@hygon.cn wrote:
> > > The alloc_syscall_stats() function always returns an error pointer
> > > (ERR_PTR) on failure.
> > >
> > > So replace NULL check with IS_ERR() check after calling
> > > delete_syscall_stats() function.
> > >
> > >
> > > [...]
> > Applied to perf-tools-next, thanks!
>
> This broke the "perf trace summary" test for me (run `perf test -v`).
> The failing command under gdb shows:
> ```
> $ gdb --args perf trace -s -- true
> ...
> (gdb) r
> ...
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000555555672210 in delete_syscall_stats (syscall_stats=0x0) at
> builtin-trace.c:1579
> 1579            hashmap__for_each_entry(syscall_stats, pos, bkt)
> (gdb) l
> 1574            size_t bkt;
> 1575
> 1576            if (IS_ERR(syscall_stats))
> 1577                    return;
> 1578
> 1579            hashmap__for_each_entry(syscall_stats, pos, bkt)
> 1580                    zfree(&pos->pvalue);
> 1581            hashmap__free(syscall_stats);
> 1582    }
> ```
>
> As NULL is false for IS_ERR I think the correct change is:
> ```
> if (syscall_stats == NULL || IS_ERR(syscall_stats))
> ```

I suspected the same thing but couldn't find a NULL assignment in the
code, so I rushed the review, I apologize. Yours looks good to me.

Thanks,
Howard

>
> We seem to be routinely breaking tests, which may indicate a lot of
> noise in the test output. I see many failures in the type profiling
> test due to the typedef vs struct issue.
>
> Thanks,
> Ian
>
> > Best regards,
> > Namhyung
> >
> >
>
[PATCH v1] perf trace: Avoid an ERR_PTR in syscall_stats
Posted by Ian Rogers 4 weeks, 1 day ago
hashmap__new may return an ERR_PTR and previously this would be
assigned to syscall_stats meaning all use of syscall_stats needs to
test for NULL (uninitialized) or an ERR_PTR. Given the only reason
hashmap__new can fail is ENOMEM, just use NULL to indicate the
allocation failure and avoid the code having to test for NULL and
IS_ERR.

Fixes: 96f202eab813 (perf trace: Fix IS_ERR() vs NULL check bug)
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-trace.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 295b272c6c29..7ff85fa90d98 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1565,7 +1565,9 @@ static bool syscall_id_equal(long key1, long key2, void *ctx __maybe_unused)
 
 static struct hashmap *alloc_syscall_stats(void)
 {
-	return hashmap__new(syscall_id_hash, syscall_id_equal, NULL);
+	struct hashmap *result = hashmap__new(syscall_id_hash, syscall_id_equal, NULL);
+
+	return IS_ERR(result) ? NULL : result;
 }
 
 static void delete_syscall_stats(struct hashmap *syscall_stats)
@@ -1573,7 +1575,7 @@ static void delete_syscall_stats(struct hashmap *syscall_stats)
 	struct hashmap_entry *pos;
 	size_t bkt;
 
-	if (IS_ERR(syscall_stats))
+	if (!syscall_stats)
 		return;
 
 	hashmap__for_each_entry(syscall_stats, pos, bkt)
@@ -1589,7 +1591,7 @@ static struct thread_trace *thread_trace__new(struct trace *trace)
 		ttrace->files.max = -1;
 		if (trace->summary) {
 			ttrace->syscall_stats = alloc_syscall_stats();
-			if (IS_ERR(ttrace->syscall_stats))
+			if (!ttrace->syscall_stats)
 				zfree(&ttrace);
 		}
 	}
@@ -4464,7 +4466,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 
 	if (trace->summary_mode == SUMMARY__BY_TOTAL && !trace->summary_bpf) {
 		trace->syscall_stats = alloc_syscall_stats();
-		if (IS_ERR(trace->syscall_stats))
+		if (!trace->syscall_stats)
 			goto out_delete_evlist;
 	}
 
@@ -4771,7 +4773,7 @@ static int trace__replay(struct trace *trace)
 
 	if (trace->summary_mode == SUMMARY__BY_TOTAL) {
 		trace->syscall_stats = alloc_syscall_stats();
-		if (IS_ERR(trace->syscall_stats))
+		if (!trace->syscall_stats)
 			goto out;
 	}
 
-- 
2.53.0.473.g4a7958ca14-goog
Re: [PATCH v1] perf trace: Avoid an ERR_PTR in syscall_stats
Posted by Namhyung Kim 4 weeks ago
On Mon, 02 Mar 2026 15:45:15 -0800, Ian Rogers wrote:
> hashmap__new may return an ERR_PTR and previously this would be
> assigned to syscall_stats meaning all use of syscall_stats needs to
> test for NULL (uninitialized) or an ERR_PTR. Given the only reason
> hashmap__new can fail is ENOMEM, just use NULL to indicate the
> allocation failure and avoid the code having to test for NULL and
> IS_ERR.
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung
Re: [PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by Howard Chu 1 month ago
Hi wangguangju,

On Thu, Feb 26, 2026 at 4:23 AM <wangguangju@hygon.cn> wrote:
>
> From: wangguangju <wangguangju@hygon.cn>
>
> The alloc_syscall_stats() function always returns an error pointer
> (ERR_PTR) on failure.
>
> So replace NULL check with IS_ERR() check after calling
> delete_syscall_stats() function.

Thanks, syscall_stats is never NULL after alloc_syscall_stats(), and
it seems like perf now frees the hashmap using the error pointer,
pretty dangerous.

>
> Signed-off-by: wangguangju <wangguangju@hygon.cn>

Reviewed-by: Howard Chu <howardchu95@gmail.com>

Thanks,
Howard

> ---
>  tools/perf/builtin-trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 311d9da9896a..295b272c6c29 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1573,7 +1573,7 @@ static void delete_syscall_stats(struct hashmap *syscall_stats)
>         struct hashmap_entry *pos;
>         size_t bkt;
>
> -       if (syscall_stats == NULL)
> +       if (IS_ERR(syscall_stats))
>                 return;
>
>         hashmap__for_each_entry(syscall_stats, pos, bkt)
> --
> 2.43.0
>
>
>
Re: [PATCH] perf trace: Fix IS_ERR() vs NULL check bug
Posted by Ian Rogers 1 month ago
On Thu, Feb 26, 2026 at 7:50 AM Howard Chu <howardchu95@gmail.com> wrote:
>
> Hi wangguangju,
>
> On Thu, Feb 26, 2026 at 4:23 AM <wangguangju@hygon.cn> wrote:
> >
> > From: wangguangju <wangguangju@hygon.cn>
> >
> > The alloc_syscall_stats() function always returns an error pointer
> > (ERR_PTR) on failure.
> >
> > So replace NULL check with IS_ERR() check after calling
> > delete_syscall_stats() function.
>
> Thanks, syscall_stats is never NULL after alloc_syscall_stats(), and
> it seems like perf now frees the hashmap using the error pointer,
> pretty dangerous.
>
> >
> > Signed-off-by: wangguangju <wangguangju@hygon.cn>
>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>

Acked-by: Ian Rogers <irogers@google.com>

We've been burnt by IS_ERR many times, I think in user land we should
declare it an anti-pattern and switch to using NULL and errno.
Alternatively we could wrap functions that return an error pointer by
returning some kind of struct, so the error state is checked. I
reached out to other kernel developers; they agree but the problem in
the kernel is too large to fix. In perf 90% of the problems originate
from hashmap. Alternatively, a sparse/Coccinelle build bot or
build-test that can flag the issue would be good.

Thanks,
Ian

> Thanks,
> Howard
>
> > ---
> >  tools/perf/builtin-trace.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 311d9da9896a..295b272c6c29 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -1573,7 +1573,7 @@ static void delete_syscall_stats(struct hashmap *syscall_stats)
> >         struct hashmap_entry *pos;
> >         size_t bkt;
> >
> > -       if (syscall_stats == NULL)
> > +       if (IS_ERR(syscall_stats))
> >                 return;
> >
> >         hashmap__for_each_entry(syscall_stats, pos, bkt)
> > --
> > 2.43.0
> >
> >
> >