[PATCH v4 0/6] Avoid parsing tracepoint format just for id

Ian Rogers posted 6 patches 2 weeks, 1 day ago
There is a newer version of this series
tools/lib/api/fs/fs.c                         |   6 +-
tools/perf/builtin-kmem.c                     |  12 +-
tools/perf/builtin-kwork.c                    |   3 +-
tools/perf/builtin-record.c                   |   2 -
tools/perf/builtin-script.c                   |   9 +-
tools/perf/builtin-trace.c                    |  79 +++++++++----
tools/perf/tests/Build                        |   6 +-
tools/perf/tests/builtin-test.c               |   2 -
tools/perf/tests/parse-events.c               |  25 +---
tools/perf/util/Build                         |   2 +-
tools/perf/util/data-convert-bt.c             |  10 +-
tools/perf/util/data-convert-json.c           |   8 +-
tools/perf/util/evsel.c                       | 110 +++++++++++++-----
tools/perf/util/evsel.h                       |   9 +-
tools/perf/util/evsel_fprintf.c               |   4 +-
tools/perf/util/parse-events.c                |  16 +--
tools/perf/util/perf_event_attr_fprintf.c     |   4 -
.../util/scripting-engines/trace-event-perl.c |   3 +-
.../scripting-engines/trace-event-python.c    |   3 +-
tools/perf/util/sort.c                        |  33 ++++--
tools/perf/util/trace-event-parse.c           |   2 +-
tools/perf/util/trace-event-scripting.c       |  10 +-
tools/perf/util/trace-event.h                 |   2 +-
23 files changed, 214 insertions(+), 146 deletions(-)
[PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Ian Rogers 2 weeks, 1 day ago
The tracepoint format isn't needed to open an event, just the id for
the config value. Refactor the use of evsel->tp_format to use an
accessor that will lazily construct its value. In evsel__newtp_idx
read the id so the config value can be set up/used.

This allows tracepoints to be used without libtraceevent in a number
of tests. Other functionality is enabled without libtracevent, such as
mapping a tracepoint id back to its name. There may be some
performance benefit to code using tracepoints but not using the format
information.

v4. Rebase due to conflict with 9ac98662dbd3 ("perf: event: Remove deadcode")
v3. Whitespace changes, Arnaldo.
v2. Add additional error checking/handling in evsel__tp_format.

Ian Rogers (6):
  tool api fs: Correctly encode errno for read/write open failures
  perf trace-event: Constify print arguments
  perf trace-event: Always build trace-event-info.c
  perf evsel: Add/use accessor for tp_format
  perf evsel: Allow evsel__newtp without libtraceevent
  perf tests: Enable tests disabled due to tracepoint parsing

 tools/lib/api/fs/fs.c                         |   6 +-
 tools/perf/builtin-kmem.c                     |  12 +-
 tools/perf/builtin-kwork.c                    |   3 +-
 tools/perf/builtin-record.c                   |   2 -
 tools/perf/builtin-script.c                   |   9 +-
 tools/perf/builtin-trace.c                    |  79 +++++++++----
 tools/perf/tests/Build                        |   6 +-
 tools/perf/tests/builtin-test.c               |   2 -
 tools/perf/tests/parse-events.c               |  25 +---
 tools/perf/util/Build                         |   2 +-
 tools/perf/util/data-convert-bt.c             |  10 +-
 tools/perf/util/data-convert-json.c           |   8 +-
 tools/perf/util/evsel.c                       | 110 +++++++++++++-----
 tools/perf/util/evsel.h                       |   9 +-
 tools/perf/util/evsel_fprintf.c               |   4 +-
 tools/perf/util/parse-events.c                |  16 +--
 tools/perf/util/perf_event_attr_fprintf.c     |   4 -
 .../util/scripting-engines/trace-event-perl.c |   3 +-
 .../scripting-engines/trace-event-python.c    |   3 +-
 tools/perf/util/sort.c                        |  33 ++++--
 tools/perf/util/trace-event-parse.c           |   2 +-
 tools/perf/util/trace-event-scripting.c       |  10 +-
 tools/perf/util/trace-event.h                 |   2 +-
 23 files changed, 214 insertions(+), 146 deletions(-)

-- 
2.47.0.277.g8800431eea-goog
Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Namhyung Kim 2 weeks ago
On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> The tracepoint format isn't needed to open an event, just the id for
> the config value. Refactor the use of evsel->tp_format to use an
> accessor that will lazily construct its value. In evsel__newtp_idx
> read the id so the config value can be set up/used.
> 
> This allows tracepoints to be used without libtraceevent in a number
> of tests. Other functionality is enabled without libtracevent, such as
> mapping a tracepoint id back to its name. There may be some
> performance benefit to code using tracepoints but not using the format
> information.
> 
> v4. Rebase due to conflict with 9ac98662dbd3 ("perf: event: Remove deadcode")
> v3. Whitespace changes, Arnaldo.
> v2. Add additional error checking/handling in evsel__tp_format.
> 
> Ian Rogers (6):
>   tool api fs: Correctly encode errno for read/write open failures
>   perf trace-event: Constify print arguments
>   perf trace-event: Always build trace-event-info.c
>   perf evsel: Add/use accessor for tp_format
>   perf evsel: Allow evsel__newtp without libtraceevent
>   perf tests: Enable tests disabled due to tracepoint parsing

After applying this series, I'm seeing some test failures.  But I don't
understand why it affects non-tracepoint events though.

  $ sudo ./perf test -v pipe
  --- start ---
  test child forked, pid 3036123
   1bde35-1bdecc l noploop
  perf does have symbol 'noploop'
  
  Record+report pipe test
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.210 MB - ]
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.517 MB - ]
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.516 MB - ]
  Record+report pipe test [Success]
  
  Inject -B build-ids test
  0xa5c [0x17a4]: failed to process type: 80
  Error:
  failed to process sample
  Inject build-ids test [Failed - cannot find noploop function in pipe #1]
  
  Inject -b build-ids test
  0xa5c [0x17a4]: failed to process type: 80
  Error:
  failed to process sample
  Inject build-ids test [Failed - cannot find noploop function in pipe #1]
  
  Inject --buildid-all build-ids test
  0xa5c [0x17a4]: failed to process type: 80
  Error:
  failed to process sample
  Inject build-ids test [Failed - cannot find noploop function in pipe #1]
  
  Inject --mmap2-buildid-all build-ids test
  0xa5c [0x17a4]: failed to process type: 80
  Error:
  failed to process sample
  Inject build-ids test [Failed - cannot find noploop function in pipe #1]
  ---- end(-1) ----
   84: perf pipe recording and injection test                          : FAILED!
  
  $ sudo ./perf test -v Zstd
  --- start ---
  test child forked, pid 3036097
  Collecting compressed record file:
  500+0 records in
  500+0 records out
  256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
  Checking compressed events stats:
  Couldn't decompress data
  0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
  Error:
  failed to process sample
  ---- end(-1) ----
   86: Zstd perf.data compression/decompression                        : FAILED!

Thanks,
Namhyung


> 
>  tools/lib/api/fs/fs.c                         |   6 +-
>  tools/perf/builtin-kmem.c                     |  12 +-
>  tools/perf/builtin-kwork.c                    |   3 +-
>  tools/perf/builtin-record.c                   |   2 -
>  tools/perf/builtin-script.c                   |   9 +-
>  tools/perf/builtin-trace.c                    |  79 +++++++++----
>  tools/perf/tests/Build                        |   6 +-
>  tools/perf/tests/builtin-test.c               |   2 -
>  tools/perf/tests/parse-events.c               |  25 +---
>  tools/perf/util/Build                         |   2 +-
>  tools/perf/util/data-convert-bt.c             |  10 +-
>  tools/perf/util/data-convert-json.c           |   8 +-
>  tools/perf/util/evsel.c                       | 110 +++++++++++++-----
>  tools/perf/util/evsel.h                       |   9 +-
>  tools/perf/util/evsel_fprintf.c               |   4 +-
>  tools/perf/util/parse-events.c                |  16 +--
>  tools/perf/util/perf_event_attr_fprintf.c     |   4 -
>  .../util/scripting-engines/trace-event-perl.c |   3 +-
>  .../scripting-engines/trace-event-python.c    |   3 +-
>  tools/perf/util/sort.c                        |  33 ++++--
>  tools/perf/util/trace-event-parse.c           |   2 +-
>  tools/perf/util/trace-event-scripting.c       |  10 +-
>  tools/perf/util/trace-event.h                 |   2 +-
>  23 files changed, 214 insertions(+), 146 deletions(-)
> 
> -- 
> 2.47.0.277.g8800431eea-goog
>
Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Ian Rogers 2 weeks ago
On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > Ian Rogers (6):
> >   tool api fs: Correctly encode errno for read/write open failures
> >   perf trace-event: Constify print arguments
> >   perf trace-event: Always build trace-event-info.c
> >   perf evsel: Add/use accessor for tp_format
> >   perf evsel: Allow evsel__newtp without libtraceevent
> >   perf tests: Enable tests disabled due to tracepoint parsing
>
> After applying this series, I'm seeing some test failures.  But I don't
> understand why it affects non-tracepoint events though.
>
>   $ sudo ./perf test -v pipe
>   --- start ---
>   test child forked, pid 3036123
>    1bde35-1bdecc l noploop
>   perf does have symbol 'noploop'
>
>   Record+report pipe test
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.210 MB - ]
>   [ perf record: Woken up 2 times to write data ]
>   [ perf record: Captured and wrote 0.517 MB - ]
>   [ perf record: Woken up 2 times to write data ]
>   [ perf record: Captured and wrote 0.516 MB - ]
>   Record+report pipe test [Success]
>
>   Inject -B build-ids test
>   0xa5c [0x17a4]: failed to process type: 80
>   Error:
>   failed to process sample
>   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
>
>   Inject -b build-ids test
>   0xa5c [0x17a4]: failed to process type: 80
>   Error:
>   failed to process sample
>   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
>
>   Inject --buildid-all build-ids test
>   0xa5c [0x17a4]: failed to process type: 80
>   Error:
>   failed to process sample
>   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
>
>   Inject --mmap2-buildid-all build-ids test
>   0xa5c [0x17a4]: failed to process type: 80
>   Error:
>   failed to process sample
>   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
>   ---- end(-1) ----
>    84: perf pipe recording and injection test                          : FAILED!
>
>   $ sudo ./perf test -v Zstd
>   --- start ---
>   test child forked, pid 3036097
>   Collecting compressed record file:
>   500+0 records in
>   500+0 records out
>   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
>   Checking compressed events stats:
>   Couldn't decompress data
>   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
>   Error:
>   failed to process sample
>   ---- end(-1) ----
>    86: Zstd perf.data compression/decompression                        : FAILED!
>
> Thanks,
> Namhyung

I'm not able to reproduce:
```
$ git log --oneline |head
a59bca6eb0a6 perf test: Add a runs-per-test flag
0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
7f57057c7884 perf evsel: Add/use accessor for tp_format
c27d357d2d4c perf trace-event: Always build trace-event-info.c
20bf7a2cd68a perf trace-event: Constify print arguments
f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
...
$ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
84: perf pipe recording and injection test                          : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
86: Zstd perf.data compression/decompression                        : Ok
```
Similarly not as root or if runs-per-test is 100.

Agreed that the changes are for tracepoints and these tests aren't for
tracepoints, so an interaction wouldn't be expected. If you have a
reliable reproduction perhaps you can bisect it.

Thanks,
Ian
Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Namhyung Kim 2 weeks ago
On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > Ian Rogers (6):
> > >   tool api fs: Correctly encode errno for read/write open failures
> > >   perf trace-event: Constify print arguments
> > >   perf trace-event: Always build trace-event-info.c
> > >   perf evsel: Add/use accessor for tp_format
> > >   perf evsel: Allow evsel__newtp without libtraceevent
> > >   perf tests: Enable tests disabled due to tracepoint parsing
> >
> > After applying this series, I'm seeing some test failures.  But I don't
> > understand why it affects non-tracepoint events though.
> >
> >   $ sudo ./perf test -v pipe
> >   --- start ---
> >   test child forked, pid 3036123
> >    1bde35-1bdecc l noploop
> >   perf does have symbol 'noploop'
> >
> >   Record+report pipe test
> >   [ perf record: Woken up 1 times to write data ]
> >   [ perf record: Captured and wrote 0.210 MB - ]
> >   [ perf record: Woken up 2 times to write data ]
> >   [ perf record: Captured and wrote 0.517 MB - ]
> >   [ perf record: Woken up 2 times to write data ]
> >   [ perf record: Captured and wrote 0.516 MB - ]
> >   Record+report pipe test [Success]
> >
> >   Inject -B build-ids test
> >   0xa5c [0x17a4]: failed to process type: 80
> >   Error:
> >   failed to process sample
> >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> >
> >   Inject -b build-ids test
> >   0xa5c [0x17a4]: failed to process type: 80
> >   Error:
> >   failed to process sample
> >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> >
> >   Inject --buildid-all build-ids test
> >   0xa5c [0x17a4]: failed to process type: 80
> >   Error:
> >   failed to process sample
> >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> >
> >   Inject --mmap2-buildid-all build-ids test
> >   0xa5c [0x17a4]: failed to process type: 80
> >   Error:
> >   failed to process sample
> >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> >   ---- end(-1) ----
> >    84: perf pipe recording and injection test                          : FAILED!
> >
> >   $ sudo ./perf test -v Zstd
> >   --- start ---
> >   test child forked, pid 3036097
> >   Collecting compressed record file:
> >   500+0 records in
> >   500+0 records out
> >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> >   [ perf record: Woken up 1 times to write data ]
> >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> >   Checking compressed events stats:
> >   Couldn't decompress data
> >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> >   Error:
> >   failed to process sample
> >   ---- end(-1) ----
> >    86: Zstd perf.data compression/decompression                        : FAILED!
> >
> > Thanks,
> > Namhyung
> 
> I'm not able to reproduce:
> ```
> $ git log --oneline |head
> a59bca6eb0a6 perf test: Add a runs-per-test flag
> 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> 7f57057c7884 perf evsel: Add/use accessor for tp_format
> c27d357d2d4c perf trace-event: Always build trace-event-info.c
> 20bf7a2cd68a perf trace-event: Constify print arguments
> f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> ...
> $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 84: perf pipe recording and injection test                          : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> 86: Zstd perf.data compression/decompression                        : Ok
> ```
> Similarly not as root or if runs-per-test is 100.
> 
> Agreed that the changes are for tracepoints and these tests aren't for
> tracepoints, so an interaction wouldn't be expected. If you have a
> reliable reproduction perhaps you can bisect it.

it says:

9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
Author: Ian Rogers <irogers@google.com>
Date:   Fri Nov 8 10:47:46 2024 -0800

    tool api fs: Correctly encode errno for read/write open failures
    
    Switch from returning -1 to -errno so that callers can determine types
    of failure.
    
    Signed-off-by: Ian Rogers <irogers@google.com>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

 tools/lib/api/fs/fs.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Thanks,
Namhyung

Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Ian Rogers 1 week, 3 days ago
On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > Ian Rogers (6):
> > > >   tool api fs: Correctly encode errno for read/write open failures
> > > >   perf trace-event: Constify print arguments
> > > >   perf trace-event: Always build trace-event-info.c
> > > >   perf evsel: Add/use accessor for tp_format
> > > >   perf evsel: Allow evsel__newtp without libtraceevent
> > > >   perf tests: Enable tests disabled due to tracepoint parsing
> > >
> > > After applying this series, I'm seeing some test failures.  But I don't
> > > understand why it affects non-tracepoint events though.
> > >
> > >   $ sudo ./perf test -v pipe
> > >   --- start ---
> > >   test child forked, pid 3036123
> > >    1bde35-1bdecc l noploop
> > >   perf does have symbol 'noploop'
> > >
> > >   Record+report pipe test
> > >   [ perf record: Woken up 1 times to write data ]
> > >   [ perf record: Captured and wrote 0.210 MB - ]
> > >   [ perf record: Woken up 2 times to write data ]
> > >   [ perf record: Captured and wrote 0.517 MB - ]
> > >   [ perf record: Woken up 2 times to write data ]
> > >   [ perf record: Captured and wrote 0.516 MB - ]
> > >   Record+report pipe test [Success]
> > >
> > >   Inject -B build-ids test
> > >   0xa5c [0x17a4]: failed to process type: 80
> > >   Error:
> > >   failed to process sample
> > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > >   Inject -b build-ids test
> > >   0xa5c [0x17a4]: failed to process type: 80
> > >   Error:
> > >   failed to process sample
> > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > >   Inject --buildid-all build-ids test
> > >   0xa5c [0x17a4]: failed to process type: 80
> > >   Error:
> > >   failed to process sample
> > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > >   Inject --mmap2-buildid-all build-ids test
> > >   0xa5c [0x17a4]: failed to process type: 80
> > >   Error:
> > >   failed to process sample
> > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >   ---- end(-1) ----
> > >    84: perf pipe recording and injection test                          : FAILED!
> > >
> > >   $ sudo ./perf test -v Zstd
> > >   --- start ---
> > >   test child forked, pid 3036097
> > >   Collecting compressed record file:
> > >   500+0 records in
> > >   500+0 records out
> > >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > >   [ perf record: Woken up 1 times to write data ]
> > >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > >   Checking compressed events stats:
> > >   Couldn't decompress data
> > >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > >   Error:
> > >   failed to process sample
> > >   ---- end(-1) ----
> > >    86: Zstd perf.data compression/decompression                        : FAILED!
> > >
> > > Thanks,
> > > Namhyung
> >
> > I'm not able to reproduce:
> > ```
> > $ git log --oneline |head
> > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > 20bf7a2cd68a perf trace-event: Constify print arguments
> > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > ...
> > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 84: perf pipe recording and injection test                          : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > 86: Zstd perf.data compression/decompression                        : Ok
> > ```
> > Similarly not as root or if runs-per-test is 100.
> >
> > Agreed that the changes are for tracepoints and these tests aren't for
> > tracepoints, so an interaction wouldn't be expected. If you have a
> > reliable reproduction perhaps you can bisect it.
>
> it says:
>
> 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> Author: Ian Rogers <irogers@google.com>
> Date:   Fri Nov 8 10:47:46 2024 -0800
>
>     tool api fs: Correctly encode errno for read/write open failures
>
>     Switch from returning -1 to -errno so that callers can determine types
>     of failure.
>
>     Signed-off-by: Ian Rogers <irogers@google.com>
>     Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
>  tools/lib/api/fs/fs.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

So I tried to eye-ball/grep all callers to spot assumptions on the
return value like:
```
err = ...__read_int
if (err == -1)
```
Didn't spot anything.

It seems in the test log the record is failing so I ran this under
gdb, set breakpoints on the 3 modified functions and then looked up
the call stack to spot bad return value assumptions. Everything looks
good.
I then tried inject and report, the only file read by these functions
is /proc/sys/kernel/perf_event_paranoid as part of symbol
initialization (nit, this should probably be read lazily and the
restriction should really come from the perf.data file, not the
running system) and those calls look good.

The change is small and not critical for the series. It improves the
error message when reading the tracepoint id fails. So we could move
forward with the rest of the series, but that could be annoying for
tracepoint users.

If I had a reproducer I'd revert the 1 line change on each function to
find out which is causing the regression. Once you have that then you
can binary search to find the bad call by using some global counter
where the first 'n' calls use the new return value and the later use
the old value. You can then vary 'n' to binary search and find the bad
caller.

Is there any chance you can help diagnose this or help me to find the
reproducer?

Thanks,
Ian
Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Namhyung Kim 5 days, 2 hours ago
On Wed, Nov 13, 2024 at 10:06:13AM -0800, Ian Rogers wrote:
> On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > > Ian Rogers (6):
> > > > >   tool api fs: Correctly encode errno for read/write open failures
> > > > >   perf trace-event: Constify print arguments
> > > > >   perf trace-event: Always build trace-event-info.c
> > > > >   perf evsel: Add/use accessor for tp_format
> > > > >   perf evsel: Allow evsel__newtp without libtraceevent
> > > > >   perf tests: Enable tests disabled due to tracepoint parsing
> > > >
> > > > After applying this series, I'm seeing some test failures.  But I don't
> > > > understand why it affects non-tracepoint events though.
> > > >
> > > >   $ sudo ./perf test -v pipe
> > > >   --- start ---
> > > >   test child forked, pid 3036123
> > > >    1bde35-1bdecc l noploop
> > > >   perf does have symbol 'noploop'
> > > >
> > > >   Record+report pipe test
> > > >   [ perf record: Woken up 1 times to write data ]
> > > >   [ perf record: Captured and wrote 0.210 MB - ]
> > > >   [ perf record: Woken up 2 times to write data ]
> > > >   [ perf record: Captured and wrote 0.517 MB - ]
> > > >   [ perf record: Woken up 2 times to write data ]
> > > >   [ perf record: Captured and wrote 0.516 MB - ]
> > > >   Record+report pipe test [Success]
> > > >
> > > >   Inject -B build-ids test
> > > >   0xa5c [0x17a4]: failed to process type: 80
> > > >   Error:
> > > >   failed to process sample
> > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > >
> > > >   Inject -b build-ids test
> > > >   0xa5c [0x17a4]: failed to process type: 80
> > > >   Error:
> > > >   failed to process sample
> > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > >
> > > >   Inject --buildid-all build-ids test
> > > >   0xa5c [0x17a4]: failed to process type: 80
> > > >   Error:
> > > >   failed to process sample
> > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > >
> > > >   Inject --mmap2-buildid-all build-ids test
> > > >   0xa5c [0x17a4]: failed to process type: 80
> > > >   Error:
> > > >   failed to process sample
> > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > >   ---- end(-1) ----
> > > >    84: perf pipe recording and injection test                          : FAILED!
> > > >
> > > >   $ sudo ./perf test -v Zstd
> > > >   --- start ---
> > > >   test child forked, pid 3036097
> > > >   Collecting compressed record file:
> > > >   500+0 records in
> > > >   500+0 records out
> > > >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > > >   [ perf record: Woken up 1 times to write data ]
> > > >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > > >   Checking compressed events stats:
> > > >   Couldn't decompress data
> > > >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > > >   Error:
> > > >   failed to process sample
> > > >   ---- end(-1) ----
> > > >    86: Zstd perf.data compression/decompression                        : FAILED!
> > > >
> > > > Thanks,
> > > > Namhyung
> > >
> > > I'm not able to reproduce:
> > > ```
> > > $ git log --oneline |head
> > > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > > 20bf7a2cd68a perf trace-event: Constify print arguments
> > > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > > ...
> > > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 84: perf pipe recording and injection test                          : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > 86: Zstd perf.data compression/decompression                        : Ok
> > > ```
> > > Similarly not as root or if runs-per-test is 100.
> > >
> > > Agreed that the changes are for tracepoints and these tests aren't for
> > > tracepoints, so an interaction wouldn't be expected. If you have a
> > > reliable reproduction perhaps you can bisect it.
> >
> > it says:
> >
> > 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> > commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> > Author: Ian Rogers <irogers@google.com>
> > Date:   Fri Nov 8 10:47:46 2024 -0800
> >
> >     tool api fs: Correctly encode errno for read/write open failures
> >
> >     Switch from returning -1 to -errno so that callers can determine types
> >     of failure.
> >
> >     Signed-off-by: Ian Rogers <irogers@google.com>
> >     Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> >  tools/lib/api/fs/fs.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> So I tried to eye-ball/grep all callers to spot assumptions on the
> return value like:
> ```
> err = ...__read_int
> if (err == -1)
> ```
> Didn't spot anything.
> 
> It seems in the test log the record is failing so I ran this under
> gdb, set breakpoints on the 3 modified functions and then looked up
> the call stack to spot bad return value assumptions. Everything looks
> good.
> I then tried inject and report, the only file read by these functions
> is /proc/sys/kernel/perf_event_paranoid as part of symbol
> initialization (nit, this should probably be read lazily and the
> restriction should really come from the perf.data file, not the
> running system) and those calls look good.
> 
> The change is small and not critical for the series. It improves the
> error message when reading the tracepoint id fails. So we could move
> forward with the rest of the series, but that could be annoying for
> tracepoint users.
> 
> If I had a reproducer I'd revert the 1 line change on each function to
> find out which is causing the regression. Once you have that then you
> can binary search to find the bad call by using some global counter
> where the first 'n' calls use the new return value and the later use
> the old value. You can then vary 'n' to binary search and find the bad
> caller.
> 
> Is there any chance you can help diagnose this or help me to find the
> reproducer?

I think this depends on the system configuration.  I've debugged it
failed when it gets cpu topology:

  ...
  read int failed: /sys/devices/system/cpu/cpu112/topology/core_id (errno=2)
  read int failed: /sys/devices/system/cpu/cpu112/topology/physical_package_id (errno=2)
  read int failed: /sys/devices/system/cpu/cpu112/topology/die_id (errno=2)
  ...

Maybe it's because # online CPUs != # possible CPUs.

  $ cat /sys/devices/system/cpu/online 
  0-63
  $ cat /sys/devices/system/cpu/possible 
  0-127

There's a code like cpu__get_socket_id() to use the return value of
sysfs__read_int() directly.  And it saves the value to aggr_cpu_id which
requires exact match like in aggr_cpu_id__equal().

Thanks,
Namhyung

Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Ian Rogers 5 days, 1 hour ago
On Mon, Nov 18, 2024 at 10:31 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Nov 13, 2024 at 10:06:13AM -0800, Ian Rogers wrote:
> > On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > > > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > > > Ian Rogers (6):
> > > > > >   tool api fs: Correctly encode errno for read/write open failures
> > > > > >   perf trace-event: Constify print arguments
> > > > > >   perf trace-event: Always build trace-event-info.c
> > > > > >   perf evsel: Add/use accessor for tp_format
> > > > > >   perf evsel: Allow evsel__newtp without libtraceevent
> > > > > >   perf tests: Enable tests disabled due to tracepoint parsing
> > > > >
> > > > > After applying this series, I'm seeing some test failures.  But I don't
> > > > > understand why it affects non-tracepoint events though.
> > > > >
> > > > >   $ sudo ./perf test -v pipe
> > > > >   --- start ---
> > > > >   test child forked, pid 3036123
> > > > >    1bde35-1bdecc l noploop
> > > > >   perf does have symbol 'noploop'
> > > > >
> > > > >   Record+report pipe test
> > > > >   [ perf record: Woken up 1 times to write data ]
> > > > >   [ perf record: Captured and wrote 0.210 MB - ]
> > > > >   [ perf record: Woken up 2 times to write data ]
> > > > >   [ perf record: Captured and wrote 0.517 MB - ]
> > > > >   [ perf record: Woken up 2 times to write data ]
> > > > >   [ perf record: Captured and wrote 0.516 MB - ]
> > > > >   Record+report pipe test [Success]
> > > > >
> > > > >   Inject -B build-ids test
> > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > >   Error:
> > > > >   failed to process sample
> > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > >
> > > > >   Inject -b build-ids test
> > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > >   Error:
> > > > >   failed to process sample
> > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > >
> > > > >   Inject --buildid-all build-ids test
> > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > >   Error:
> > > > >   failed to process sample
> > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > >
> > > > >   Inject --mmap2-buildid-all build-ids test
> > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > >   Error:
> > > > >   failed to process sample
> > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > >   ---- end(-1) ----
> > > > >    84: perf pipe recording and injection test                          : FAILED!
> > > > >
> > > > >   $ sudo ./perf test -v Zstd
> > > > >   --- start ---
> > > > >   test child forked, pid 3036097
> > > > >   Collecting compressed record file:
> > > > >   500+0 records in
> > > > >   500+0 records out
> > > > >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > > > >   [ perf record: Woken up 1 times to write data ]
> > > > >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > > > >   Checking compressed events stats:
> > > > >   Couldn't decompress data
> > > > >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > > > >   Error:
> > > > >   failed to process sample
> > > > >   ---- end(-1) ----
> > > > >    86: Zstd perf.data compression/decompression                        : FAILED!
> > > > >
> > > > > Thanks,
> > > > > Namhyung
> > > >
> > > > I'm not able to reproduce:
> > > > ```
> > > > $ git log --oneline |head
> > > > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > > > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > > > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > > > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > > > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > > > 20bf7a2cd68a perf trace-event: Constify print arguments
> > > > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > > > ...
> > > > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 84: perf pipe recording and injection test                          : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > ```
> > > > Similarly not as root or if runs-per-test is 100.
> > > >
> > > > Agreed that the changes are for tracepoints and these tests aren't for
> > > > tracepoints, so an interaction wouldn't be expected. If you have a
> > > > reliable reproduction perhaps you can bisect it.
> > >
> > > it says:
> > >
> > > 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> > > commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> > > Author: Ian Rogers <irogers@google.com>
> > > Date:   Fri Nov 8 10:47:46 2024 -0800
> > >
> > >     tool api fs: Correctly encode errno for read/write open failures
> > >
> > >     Switch from returning -1 to -errno so that callers can determine types
> > >     of failure.
> > >
> > >     Signed-off-by: Ian Rogers <irogers@google.com>
> > >     Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > >
> > >  tools/lib/api/fs/fs.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > So I tried to eye-ball/grep all callers to spot assumptions on the
> > return value like:
> > ```
> > err = ...__read_int
> > if (err == -1)
> > ```
> > Didn't spot anything.
> >
> > It seems in the test log the record is failing so I ran this under
> > gdb, set breakpoints on the 3 modified functions and then looked up
> > the call stack to spot bad return value assumptions. Everything looks
> > good.
> > I then tried inject and report, the only file read by these functions
> > is /proc/sys/kernel/perf_event_paranoid as part of symbol
> > initialization (nit, this should probably be read lazily and the
> > restriction should really come from the perf.data file, not the
> > running system) and those calls look good.
> >
> > The change is small and not critical for the series. It improves the
> > error message when reading the tracepoint id fails. So we could move
> > forward with the rest of the series, but that could be annoying for
> > tracepoint users.
> >
> > If I had a reproducer I'd revert the 1 line change on each function to
> > find out which is causing the regression. Once you have that then you
> > can binary search to find the bad call by using some global counter
> > where the first 'n' calls use the new return value and the later use
> > the old value. You can then vary 'n' to binary search and find the bad
> > caller.
> >
> > Is there any chance you can help diagnose this or help me to find the
> > reproducer?
>
> I think this depends on the system configuration.  I've debugged it
> failed when it gets cpu topology:
>
>   ...
>   read int failed: /sys/devices/system/cpu/cpu112/topology/core_id (errno=2)
>   read int failed: /sys/devices/system/cpu/cpu112/topology/physical_package_id (errno=2)
>   read int failed: /sys/devices/system/cpu/cpu112/topology/die_id (errno=2)
>   ...
>
> Maybe it's because # online CPUs != # possible CPUs.
>
>   $ cat /sys/devices/system/cpu/online
>   0-63
>   $ cat /sys/devices/system/cpu/possible
>   0-127
>
> There's a code like cpu__get_socket_id() to use the return value of
> sysfs__read_int() directly.  And it saves the value to aggr_cpu_id which
> requires exact match like in aggr_cpu_id__equal().

So this is a latent bug. Are you working on the fix or asking me to do
it? I'm not sure why we should fail to describe the topology for
offline cores, but if this is a kernel restriction we should probably
purge those logical CPUs from the topology.

Thanks,
Ian
Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Namhyung Kim 4 days, 23 hours ago
On Mon, Nov 18, 2024 at 11:35:19AM -0800, Ian Rogers wrote:
> On Mon, Nov 18, 2024 at 10:31 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Nov 13, 2024 at 10:06:13AM -0800, Ian Rogers wrote:
> > > On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > > > > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > > > > Ian Rogers (6):
> > > > > > >   tool api fs: Correctly encode errno for read/write open failures
> > > > > > >   perf trace-event: Constify print arguments
> > > > > > >   perf trace-event: Always build trace-event-info.c
> > > > > > >   perf evsel: Add/use accessor for tp_format
> > > > > > >   perf evsel: Allow evsel__newtp without libtraceevent
> > > > > > >   perf tests: Enable tests disabled due to tracepoint parsing
> > > > > >
> > > > > > After applying this series, I'm seeing some test failures.  But I don't
> > > > > > understand why it affects non-tracepoint events though.
> > > > > >
> > > > > >   $ sudo ./perf test -v pipe
> > > > > >   --- start ---
> > > > > >   test child forked, pid 3036123
> > > > > >    1bde35-1bdecc l noploop
> > > > > >   perf does have symbol 'noploop'
> > > > > >
> > > > > >   Record+report pipe test
> > > > > >   [ perf record: Woken up 1 times to write data ]
> > > > > >   [ perf record: Captured and wrote 0.210 MB - ]
> > > > > >   [ perf record: Woken up 2 times to write data ]
> > > > > >   [ perf record: Captured and wrote 0.517 MB - ]
> > > > > >   [ perf record: Woken up 2 times to write data ]
> > > > > >   [ perf record: Captured and wrote 0.516 MB - ]
> > > > > >   Record+report pipe test [Success]
> > > > > >
> > > > > >   Inject -B build-ids test
> > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > >   Error:
> > > > > >   failed to process sample
> > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > >
> > > > > >   Inject -b build-ids test
> > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > >   Error:
> > > > > >   failed to process sample
> > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > >
> > > > > >   Inject --buildid-all build-ids test
> > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > >   Error:
> > > > > >   failed to process sample
> > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > >
> > > > > >   Inject --mmap2-buildid-all build-ids test
> > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > >   Error:
> > > > > >   failed to process sample
> > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > >   ---- end(-1) ----
> > > > > >    84: perf pipe recording and injection test                          : FAILED!
> > > > > >
> > > > > >   $ sudo ./perf test -v Zstd
> > > > > >   --- start ---
> > > > > >   test child forked, pid 3036097
> > > > > >   Collecting compressed record file:
> > > > > >   500+0 records in
> > > > > >   500+0 records out
> > > > > >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > > > > >   [ perf record: Woken up 1 times to write data ]
> > > > > >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > > > > >   Checking compressed events stats:
> > > > > >   Couldn't decompress data
> > > > > >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > > > > >   Error:
> > > > > >   failed to process sample
> > > > > >   ---- end(-1) ----
> > > > > >    86: Zstd perf.data compression/decompression                        : FAILED!
> > > > > >
> > > > > > Thanks,
> > > > > > Namhyung
> > > > >
> > > > > I'm not able to reproduce:
> > > > > ```
> > > > > $ git log --oneline |head
> > > > > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > > > > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > > > > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > > > > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > > > > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > > > > 20bf7a2cd68a perf trace-event: Constify print arguments
> > > > > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > > > > ...
> > > > > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > ```
> > > > > Similarly not as root or if runs-per-test is 100.
> > > > >
> > > > > Agreed that the changes are for tracepoints and these tests aren't for
> > > > > tracepoints, so an interaction wouldn't be expected. If you have a
> > > > > reliable reproduction perhaps you can bisect it.
> > > >
> > > > it says:
> > > >
> > > > 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> > > > commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> > > > Author: Ian Rogers <irogers@google.com>
> > > > Date:   Fri Nov 8 10:47:46 2024 -0800
> > > >
> > > >     tool api fs: Correctly encode errno for read/write open failures
> > > >
> > > >     Switch from returning -1 to -errno so that callers can determine types
> > > >     of failure.
> > > >
> > > >     Signed-off-by: Ian Rogers <irogers@google.com>
> > > >     Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > >
> > > >  tools/lib/api/fs/fs.c | 6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > So I tried to eye-ball/grep all callers to spot assumptions on the
> > > return value like:
> > > ```
> > > err = ...__read_int
> > > if (err == -1)
> > > ```
> > > Didn't spot anything.
> > >
> > > It seems in the test log the record is failing so I ran this under
> > > gdb, set breakpoints on the 3 modified functions and then looked up
> > > the call stack to spot bad return value assumptions. Everything looks
> > > good.
> > > I then tried inject and report, the only file read by these functions
> > > is /proc/sys/kernel/perf_event_paranoid as part of symbol
> > > initialization (nit, this should probably be read lazily and the
> > > restriction should really come from the perf.data file, not the
> > > running system) and those calls look good.
> > >
> > > The change is small and not critical for the series. It improves the
> > > error message when reading the tracepoint id fails. So we could move
> > > forward with the rest of the series, but that could be annoying for
> > > tracepoint users.
> > >
> > > If I had a reproducer I'd revert the 1 line change on each function to
> > > find out which is causing the regression. Once you have that then you
> > > can binary search to find the bad call by using some global counter
> > > where the first 'n' calls use the new return value and the later use
> > > the old value. You can then vary 'n' to binary search and find the bad
> > > caller.
> > >
> > > Is there any chance you can help diagnose this or help me to find the
> > > reproducer?
> >
> > I think this depends on the system configuration.  I've debugged it
> > failed when it gets cpu topology:
> >
> >   ...
> >   read int failed: /sys/devices/system/cpu/cpu112/topology/core_id (errno=2)
> >   read int failed: /sys/devices/system/cpu/cpu112/topology/physical_package_id (errno=2)
> >   read int failed: /sys/devices/system/cpu/cpu112/topology/die_id (errno=2)
> >   ...
> >
> > Maybe it's because # online CPUs != # possible CPUs.
> >
> >   $ cat /sys/devices/system/cpu/online
> >   0-63
> >   $ cat /sys/devices/system/cpu/possible
> >   0-127
> >
> > There's a code like cpu__get_socket_id() to use the return value of
> > sysfs__read_int() directly.  And it saves the value to aggr_cpu_id which
> > requires exact match like in aggr_cpu_id__equal().
> 
> So this is a latent bug. Are you working on the fix or asking me to do
> it? I'm not sure why we should fail to describe the topology for
> offline cores, but if this is a kernel restriction we should probably
> purge those logical CPUs from the topology.

It doesn't have the topology directory for the offline CPUs.

  $ ls /sys/devices/system/cpu/cpu112/
  acpi_cppc  cpufreq  cpuidle  crash_notes  crash_notes_size  driver  firmware_node  hotplug  node0  online  power  subsystem  uevent

  $ ls /sys/devices/system/cpu/cpu112/topology
  ls: cannot access '/sys/devices/system/cpu/cpu112/topology': No such file or directory

Anyway, I'm not sure if it's really needed for this change.  Maybe you
can drop the patch 1 from the series for now and tackle it later.

Thanks,
Namhyung

Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
Posted by Ian Rogers 4 days, 22 hours ago
On Mon, Nov 18, 2024 at 1:57 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Mon, Nov 18, 2024 at 11:35:19AM -0800, Ian Rogers wrote:
> > On Mon, Nov 18, 2024 at 10:31 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Wed, Nov 13, 2024 at 10:06:13AM -0800, Ian Rogers wrote:
> > > > On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > >
> > > > > On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > > > > > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > > > > > Ian Rogers (6):
> > > > > > > >   tool api fs: Correctly encode errno for read/write open failures
> > > > > > > >   perf trace-event: Constify print arguments
> > > > > > > >   perf trace-event: Always build trace-event-info.c
> > > > > > > >   perf evsel: Add/use accessor for tp_format
> > > > > > > >   perf evsel: Allow evsel__newtp without libtraceevent
> > > > > > > >   perf tests: Enable tests disabled due to tracepoint parsing
> > > > > > >
> > > > > > > After applying this series, I'm seeing some test failures.  But I don't
> > > > > > > understand why it affects non-tracepoint events though.
> > > > > > >
> > > > > > >   $ sudo ./perf test -v pipe
> > > > > > >   --- start ---
> > > > > > >   test child forked, pid 3036123
> > > > > > >    1bde35-1bdecc l noploop
> > > > > > >   perf does have symbol 'noploop'
> > > > > > >
> > > > > > >   Record+report pipe test
> > > > > > >   [ perf record: Woken up 1 times to write data ]
> > > > > > >   [ perf record: Captured and wrote 0.210 MB - ]
> > > > > > >   [ perf record: Woken up 2 times to write data ]
> > > > > > >   [ perf record: Captured and wrote 0.517 MB - ]
> > > > > > >   [ perf record: Woken up 2 times to write data ]
> > > > > > >   [ perf record: Captured and wrote 0.516 MB - ]
> > > > > > >   Record+report pipe test [Success]
> > > > > > >
> > > > > > >   Inject -B build-ids test
> > > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > > >   Error:
> > > > > > >   failed to process sample
> > > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > > >
> > > > > > >   Inject -b build-ids test
> > > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > > >   Error:
> > > > > > >   failed to process sample
> > > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > > >
> > > > > > >   Inject --buildid-all build-ids test
> > > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > > >   Error:
> > > > > > >   failed to process sample
> > > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > > >
> > > > > > >   Inject --mmap2-buildid-all build-ids test
> > > > > > >   0xa5c [0x17a4]: failed to process type: 80
> > > > > > >   Error:
> > > > > > >   failed to process sample
> > > > > > >   Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > > > > >   ---- end(-1) ----
> > > > > > >    84: perf pipe recording and injection test                          : FAILED!
> > > > > > >
> > > > > > >   $ sudo ./perf test -v Zstd
> > > > > > >   --- start ---
> > > > > > >   test child forked, pid 3036097
> > > > > > >   Collecting compressed record file:
> > > > > > >   500+0 records in
> > > > > > >   500+0 records out
> > > > > > >   256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > > > > > >   [ perf record: Woken up 1 times to write data ]
> > > > > > >   [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > > > > > >   Checking compressed events stats:
> > > > > > >   Couldn't decompress data
> > > > > > >   0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > > > > > >   Error:
> > > > > > >   failed to process sample
> > > > > > >   ---- end(-1) ----
> > > > > > >    86: Zstd perf.data compression/decompression                        : FAILED!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Namhyung
> > > > > >
> > > > > > I'm not able to reproduce:
> > > > > > ```
> > > > > > $ git log --oneline |head
> > > > > > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > > > > > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > > > > > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > > > > > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > > > > > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > > > > > 20bf7a2cd68a perf trace-event: Constify print arguments
> > > > > > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > > > > > ...
> > > > > > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 84: perf pipe recording and injection test                          : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > 86: Zstd perf.data compression/decompression                        : Ok
> > > > > > ```
> > > > > > Similarly not as root or if runs-per-test is 100.
> > > > > >
> > > > > > Agreed that the changes are for tracepoints and these tests aren't for
> > > > > > tracepoints, so an interaction wouldn't be expected. If you have a
> > > > > > reliable reproduction perhaps you can bisect it.
> > > > >
> > > > > it says:
> > > > >
> > > > > 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> > > > > commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> > > > > Author: Ian Rogers <irogers@google.com>
> > > > > Date:   Fri Nov 8 10:47:46 2024 -0800
> > > > >
> > > > >     tool api fs: Correctly encode errno for read/write open failures
> > > > >
> > > > >     Switch from returning -1 to -errno so that callers can determine types
> > > > >     of failure.
> > > > >
> > > > >     Signed-off-by: Ian Rogers <irogers@google.com>
> > > > >     Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > > >
> > > > >  tools/lib/api/fs/fs.c | 6 +++---
> > > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > So I tried to eye-ball/grep all callers to spot assumptions on the
> > > > return value like:
> > > > ```
> > > > err = ...__read_int
> > > > if (err == -1)
> > > > ```
> > > > Didn't spot anything.
> > > >
> > > > It seems in the test log the record is failing so I ran this under
> > > > gdb, set breakpoints on the 3 modified functions and then looked up
> > > > the call stack to spot bad return value assumptions. Everything looks
> > > > good.
> > > > I then tried inject and report, the only file read by these functions
> > > > is /proc/sys/kernel/perf_event_paranoid as part of symbol
> > > > initialization (nit, this should probably be read lazily and the
> > > > restriction should really come from the perf.data file, not the
> > > > running system) and those calls look good.
> > > >
> > > > The change is small and not critical for the series. It improves the
> > > > error message when reading the tracepoint id fails. So we could move
> > > > forward with the rest of the series, but that could be annoying for
> > > > tracepoint users.
> > > >
> > > > If I had a reproducer I'd revert the 1 line change on each function to
> > > > find out which is causing the regression. Once you have that then you
> > > > can binary search to find the bad call by using some global counter
> > > > where the first 'n' calls use the new return value and the later use
> > > > the old value. You can then vary 'n' to binary search and find the bad
> > > > caller.
> > > >
> > > > Is there any chance you can help diagnose this or help me to find the
> > > > reproducer?
> > >
> > > I think this depends on the system configuration.  I've debugged it
> > > failed when it gets cpu topology:
> > >
> > >   ...
> > >   read int failed: /sys/devices/system/cpu/cpu112/topology/core_id (errno=2)
> > >   read int failed: /sys/devices/system/cpu/cpu112/topology/physical_package_id (errno=2)
> > >   read int failed: /sys/devices/system/cpu/cpu112/topology/die_id (errno=2)
> > >   ...
> > >
> > > Maybe it's because # online CPUs != # possible CPUs.
> > >
> > >   $ cat /sys/devices/system/cpu/online
> > >   0-63
> > >   $ cat /sys/devices/system/cpu/possible
> > >   0-127
> > >
> > > There's a code like cpu__get_socket_id() to use the return value of
> > > sysfs__read_int() directly.  And it saves the value to aggr_cpu_id which
> > > requires exact match like in aggr_cpu_id__equal().
> >
> > So this is a latent bug. Are you working on the fix or asking me to do
> > it? I'm not sure why we should fail to describe the topology for
> > offline cores, but if this is a kernel restriction we should probably
> > purge those logical CPUs from the topology.
>
> It doesn't have the topology directory for the offline CPUs.
>
>   $ ls /sys/devices/system/cpu/cpu112/
>   acpi_cppc  cpufreq  cpuidle  crash_notes  crash_notes_size  driver  firmware_node  hotplug  node0  online  power  subsystem  uevent
>
>   $ ls /sys/devices/system/cpu/cpu112/topology
>   ls: cannot access '/sys/devices/system/cpu/cpu112/topology': No such file or directory
>
> Anyway, I'm not sure if it's really needed for this change.  Maybe you
> can drop the patch 1 from the series for now and tackle it later.

So this would be a regression. Currently in add_tracepoint the calls look like:

add_tracepoint
|->evsel__new_tp_idx
  |-> tp_format
    |->filename__read_str

The filename__read_str error is passed in add_tracepoint for errors to
tracepoint_error that handles EACCES, ENOENT and default as separate
cases:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c#n466
The code changes filename__read_str to filename__read_int but if that
can only return -1 on error then that'll make all tracepoint open
errors look like EPERM.

Based on your instructions I was able to reproduce the problem and
have a fix to just force the error value to always be -1. I'll add
that to the series and send as v5.

Thanks,
Ian