[PATCH v2 0/3] Support .gnu_debugdata for symbols in perf

Stephen Brennan posted 3 patches 9 months, 4 weeks ago
There is a newer version of this series
[PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Stephen Brennan 9 months, 4 weeks ago
Hello all,

This series adds the ability to read symbols from the ".gnu_debugdata" section,
an LZMA-compressed embedded ELF file which is supposed to contain additional ELF
symbols. This is something that Fedora implemented (as "MiniDebuginfo" [1]).
There are more details in v1. I've tested it with binaries that have
.gnu_debugdata, and I've also ensured that the build & runtime work when LZMA is
disabled.

[1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo

Changes since v1:
* Reuses the existing LZMA decompression helpers, rather than implementing a
  new LZMA decompression loop. This does involve creating a temporary file, but
  I think that actually makes things cleaner, since now the symsrc has a file
  descriptor to close, rather than adding a new pointer that needs freeing.
* I did also remove the pr_debug() for the case where there is no
  ".gnu_debugdata" section. That's not really an error worth logging, that's
  just normal operation.
* I added a pr_debug() for the case where we successfully load .gnu_debugdata
  so that it's easier to determine whether it gets used in tests.
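
As a rough illustration of the temporary-file approach in the first bullet: the sketch below is Python, not the perf C code, and the helper name decompress_to_tempfile and the choice to unlink the file right away are assumptions made for the example, not taken from the patches. The point is only that the caller ends up holding a file descriptor to close:

```python
import lzma
import os
import tempfile


def decompress_to_tempfile(compressed: bytes) -> int:
    """Decompress an XZ/LZMA blob into a temp file and return an open fd.

    Hypothetical helper: it models "the symsrc has a file descriptor to
    close" rather than "a pointer that needs freeing".
    """
    data = lzma.decompress(compressed)  # auto-detects .xz and legacy .lzma
    fd, path = tempfile.mkstemp(prefix="gnu_debugdata.")
    try:
        os.unlink(path)  # assumption: keep the file anonymous (POSIX only)
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # os.write may write only part
            view = view[written:]
        os.lseek(fd, 0, os.SEEK_SET)  # rewind so a reader starts at offset 0
    except Exception:
        os.close(fd)
        raise
    return fd


if __name__ == "__main__":
    blob = lzma.compress(b"\x7fELF" + bytes(60))  # stand-in for a section
    fd = decompress_to_tempfile(blob)
    print(os.read(fd, 4))  # first four bytes of the decompressed image
    os.close(fd)
```

Cleanup is then a single os.close(fd), which is the symmetry the bullet argues for.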

v1: https://lore.kernel.org/linux-perf-users/20250213190542.3249050-1-stephen.s.brennan@oracle.com/

Stephen Brennan (3):
  tools: perf: add dummy functions for !HAVE_LZMA_SUPPORT
  tools: perf: add LZMA decompression from FILE
  tools: perf: support .gnu_debugdata for symbols

 tools/perf/util/compress.h   |  20 +++++++
 tools/perf/util/dso.c        |   2 +
 tools/perf/util/dso.h        |   1 +
 tools/perf/util/lzma.c       |  29 ++++++----
 tools/perf/util/symbol-elf.c | 106 ++++++++++++++++++++++++++++++++++-
 tools/perf/util/symbol.c     |   2 +
 6 files changed, 147 insertions(+), 13 deletions(-)

-- 
2.43.5
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Namhyung Kim 9 months, 3 weeks ago
On Thu, Feb 20, 2025 at 10:55:08AM -0800, Stephen Brennan wrote:
> Hello all,
> 
> This series adds the ability to read symbols from the ".gnu_debugdata" section,
> an LZMA-compressed embedded ELF file which is supposed to contain additional ELF
> symbols. This is something that Fedora implemented (as "MiniDebuginfo" [1]).
> There are more details in v1. I've tested it with binaries that have
> .gnu_debugdata, and I've also ensured that the build & runtime work when LZMA is
> disabled.
> 
> [1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo
> 
> Changes since v1:
> * Reuses the existing LZMA decompression helpers, rather than implementing a
>   new LZMA decompression loop. This does involve creating a temporary file, but
>   I think that actually makes things cleaner, since now the symsrc has a file
>   descriptor to close, rather than adding a new pointer that needs freeing.
> * I did also remove the pr_debug() for the case where there is no
>   ".gnu_debugdata" section. That's not really an error worth logging, that's
>   just normal operation.
> * I added a pr_debug() for the case where we successfully load .gnu_debugdata
>   so that it's easier to determine whether it gets used in tests.

Thanks, it'd be nice if anyone with a Fedora box could test this.

Thanks,
Namhyung

> 
> v1: https://lore.kernel.org/linux-perf-users/20250213190542.3249050-1-stephen.s.brennan@oracle.com/
> 
> Stephen Brennan (3):
>   tools: perf: add dummy functions for !HAVE_LZMA_SUPPORT
>   tools: perf: add LZMA decompression from FILE
>   tools: perf: support .gnu_debugdata for symbols
> 
>  tools/perf/util/compress.h   |  20 +++++++
>  tools/perf/util/dso.c        |   2 +
>  tools/perf/util/dso.h        |   1 +
>  tools/perf/util/lzma.c       |  29 ++++++----
>  tools/perf/util/symbol-elf.c | 106 ++++++++++++++++++++++++++++++++++-
>  tools/perf/util/symbol.c     |   2 +
>  6 files changed, 147 insertions(+), 13 deletions(-)
> 
> -- 
> 2.43.5
>
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Arnaldo Carvalho de Melo 9 months, 1 week ago
On Wed, Feb 26, 2025 at 02:06:28PM -0800, Namhyung Kim wrote:
> On Thu, Feb 20, 2025 at 10:55:08AM -0800, Stephen Brennan wrote:
> > Hello all,
> > 
> > This series adds the ability to read symbols from the ".gnu_debugdata" section,
> > an LZMA-compressed embedded ELF file which is supposed to contain additional ELF
> > symbols. This is something that Fedora implemented (as "MiniDebuginfo" [1]).
> > There are more details in v1. I've tested it with binaries that have
> > .gnu_debugdata, and I've also ensured that the build & runtime work when LZMA is
> > disabled.
> > 
> > [1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo
> > 
> > Changes since v1:
> > * Reuses the existing LZMA decompression helpers, rather than implementing a
> >   new LZMA decompression loop. This does involve creating a temporary file, but
> >   I think that actually makes things cleaner, since now the symsrc has a file
> >   descriptor to close, rather than adding a new pointer that needs freeing.
> > * I did also remove the pr_debug() for the case where there is no
> >   ".gnu_debugdata" section. That's not really an error worth logging, that's
> >   just normal operation.
> > * I added a pr_debug() for the case where we successfully load .gnu_debugdata
> >   so that it's easier to determine whether it gets used in tests.
> 
> Thanks, it'd be nice if anyone with a Fedora box could test this.

I'm trying to go thru this, testing with/without LZMA so that we can
show the difference in symbol resolution, etc, but I've now stumbled on
something that predates this, namely trying to build with NO_LZMA=1
isn't disabling it:

⬢ [acme@toolbox perf-tools-next]$ rm -rf /tmp/build/$(basename $PWD)/ ; mkdir /tmp/build/$(basename $PWD)/ ; make NO_LZMA=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin
make: Entering directory '/home/acme/git/perf-tools-next/tools/perf'
  BUILD:   Doing 'make -j28' parallel build

Auto-detecting system features:
...                                   libdw: [ on  ]
...                                   glibc: [ on  ]
...                                  libbfd: [ on  ]
...                          libbfd-buildid: [ on  ]
...                                  libelf: [ on  ]
...                                 libnuma: [ on  ]
...                  numa_num_possible_cpus: [ on  ]
...                                 libperl: [ on  ]
...                               libpython: [ on  ]
...                               libcrypto: [ on  ]
...                               libunwind: [ on  ]
...                             libcapstone: [ on  ]
...                               llvm-perf: [ on  ]
...                                    zlib: [ on  ]
...                                    lzma: [ on  ]
<SNIP>


⬢ [acme@toolbox perf-tools-next]$ ldd ~/bin/perf | grep lzma
	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f77ac879000)
⬢ [acme@toolbox perf-tools-next]$

my hunch is that some other feature needs lzma support and ignores the
explicit NO_LZMA=1 on the make command line when it should really be
disabling whatever features need it, not overriding the cmd line
request.

I'm trying to investigate.

- Arnaldo
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Arnaldo Carvalho de Melo 9 months, 1 week ago
On Fri, Mar 07, 2025 at 05:10:55PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Feb 26, 2025 at 02:06:28PM -0800, Namhyung Kim wrote:
> > On Thu, Feb 20, 2025 at 10:55:08AM -0800, Stephen Brennan wrote:
> > > This series adds the ability to read symbols from the ".gnu_debugdata" section,
> > > an LZMA-compressed embedded ELF file which is supposed to contain additional ELF
> > > symbols. This is something that Fedora implemented (as "MiniDebuginfo" [1]).
> > > There are more details in v1. I've tested it with binaries that have
> > > .gnu_debugdata, and I've also ensured that the build & runtime work when LZMA is
> > > disabled.

> > > [1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo

> > > Changes since v1:
> > > * Reuses the existing LZMA decompression helpers, rather than implementing a
> > >   new LZMA decompression loop. This does involve creating a temporary file, but
> > >   I think that actually makes things cleaner, since now the symsrc has a file
> > >   descriptor to close, rather than adding a new pointer that needs freeing.
> > > * I did also remove the pr_debug() for the case where there is no
> > >   ".gnu_debugdata" section. That's not really an error worth logging, that's
> > >   just normal operation.
> > > * I added a pr_debug() for the case where we successfully load .gnu_debugdata
> > >   so that it's easier to determine whether it gets used in tests.
> > 
> > Thanks, it'd be nice if anyone with a Fedora box could test this.
> 
> I'm trying to go thru this, testing with/without LZMA so that we can
> show the difference in symbol resolution, etc, but I've now stumbled on
> something that predates this, namely trying to build with NO_LZMA=1
> isn't disabling it:
 
> ⬢ [acme@toolbox perf-tools-next]$ rm -rf /tmp/build/$(basename $PWD)/ ; mkdir /tmp/build/$(basename $PWD)/ ; make NO_LZMA=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin
> make: Entering directory '/home/acme/git/perf-tools-next/tools/perf'
>   BUILD:   Doing 'make -j28' parallel build
> 
> Auto-detecting system features:
> ...                                   libdw: [ on  ]
> ...                                   glibc: [ on  ]
> ...                                  libbfd: [ on  ]
> ...                          libbfd-buildid: [ on  ]
> ...                                  libelf: [ on  ]
> ...                                 libnuma: [ on  ]
> ...                  numa_num_possible_cpus: [ on  ]
> ...                                 libperl: [ on  ]
> ...                               libpython: [ on  ]
> ...                               libcrypto: [ on  ]
> ...                               libunwind: [ on  ]
> ...                             libcapstone: [ on  ]
> ...                               llvm-perf: [ on  ]
> ...                                    zlib: [ on  ]
> ...                                    lzma: [ on  ]
> <SNIP>
> 
> 
> ⬢ [acme@toolbox perf-tools-next]$ ldd ~/bin/perf | grep lzma
> 	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f77ac879000)
> ⬢ [acme@toolbox perf-tools-next]$
> 
> my hunch is that some other feature needs lzma support and ignores the
> explicit NO_LZMA=1 on the make command line when it should really be
> disabling whatever features need it, not overriding the cmd line
> request.
> 
> I'm trying to investigate.

It's an interesting case: as this patch says, elfutils now depends on
liblzma, so:

⬢ [acme@toolbox perf-tools-next]$ rpm -qa | grep xz
xz-libs-5.4.6-3.fc40.x86_64
xz-5.4.6-3.fc40.x86_64
xz-devel-5.4.6-3.fc40.x86_64
⬢ [acme@toolbox perf-tools-next]$ rpm -e xz-devel
error: Failed dependencies:
	pkgconfig(liblzma) is needed by (installed) elfutils-devel-0.192-9.fc40.x86_64
	pkgconfig(liblzma) is needed by (installed) libxml2-devel-2.12.9-1.fc40.x86_64
	xz-devel(x86-64) is needed by (installed) libxml2-devel-2.12.9-1.fc40.x86_64
⬢ [acme@toolbox perf-tools-next]$

The NO_LZMA code in the perf build system should at this point either be
deleted, as elfutils is so critical for perf, or mean that outside of
elfutils, perf should make no use of lzma, which seems odd even with
some potentially marginal value.

So to test this series I'll have to collect data before these patches
get applied, making sure we get samples from symbols in binaries with a
MiniDebuginfo section. Then do a perf report and see them as unresolved,
after making sure we don't have the debuginfo files installed and
zapping whatever local debuginfo cache we have (debuginfod, perf's,
etc). Finally, apply the patches and see if more symbols get resolved
by looking at the .gnu_debugdata section.

Ok, doing that now.

- Arnaldo
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Arnaldo Carvalho de Melo 9 months, 1 week ago
On Fri, Mar 07, 2025 at 05:18:36PM -0300, Arnaldo Carvalho de Melo wrote:
> The NO_LZMA code in the perf build system should at this point either be
> deleted, as elfutils is so critical for perf, or mean that outside of
> elfutils, perf should make no use of lzma, which seems odd even with
> some potentially marginal value.
 
> So for testing this series I'll have to collect data before these
> patches get applied, making sure we collect samples from symbols in
> binaries with a MiniDebuginfo section, do a perf report, see them as
> being not resolved after making sure we don't have its debuginfo files
> installed and zapping whatever local debuginfo cache we have
> (debuginfod, perf's, etc), apply the patches and then see if it gets more
> symbols resolved by looking at the .gnu_debugdata section.
> 
> Ok, doing that now.

Works:

⬢ [acme@toolbox perf-tools-next]$ taskset -c 0 perf record -e cpu_core/cycles/P find . > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (163 samples) ]
⬢ [acme@toolbox perf-tools-next]$ perf report --stdio > before
⬢ [acme@toolbox perf-tools-next]$ 

Apply the patches and:

⬢ [acme@toolbox perf-tools-next]$ perf report --stdio > after
⬢ [acme@toolbox perf-tools-next]$ diff -u before after
--- before	2025-03-07 17:33:15.113447391 -0300
+++ after	2025-03-07 17:33:39.291525994 -0300
@@ -9,88 +9,56 @@
 # Overhead  Command  Shared Object         Symbol                            
 # ........  .......  ....................  ..................................
 #
+     8.72%  find     find                  [.] consider_visiting
      7.90%  find     libc.so.6             [.] __GI___readdir64
      7.44%  find     libc.so.6             [.] _int_malloc
+     7.06%  find     find                  [.] find
+     6.20%  find     find                  [.] fts_build.constprop.0
      6.18%  find     libc.so.6             [.] __memmove_avx_unaligned_erms
+     4.36%  find     find                  [.] pred_print
      4.14%  find     libc.so.6             [.] __printf_buffer
      3.65%  find     libc.so.6             [.] __strlen_avx2
      3.35%  find     libc.so.6             [.] malloc
-     2.65%  find     find                  [.] 0x000000000000b498
+     2.85%  find     find                  [.] fts_alloc
      2.51%  find     libc.so.6             [.] __vfprintf_internal
      2.45%  find     libc.so.6             [.] __fprintf_chk
-     2.45%  find     find                  [.] 0x00000000000089e3
      2.33%  find     libc.so.6             [.] __printf_buffer_write
      2.13%  find     libc.so.6             [.] _int_free_merge_chunk
      1.88%  find     libc.so.6             [.] __printf_buffer_flush_to_file
-     1.87%  find     find                  [.] 0x000000000000bf8e
      1.79%  find     libc.so.6             [.] _int_free
      1.64%  find     libc.so.6             [.] msort_with_tmp.part.0
      1.63%  find     find                  [.] free@plt
-     1.34%  find     find                  [.] 0x000000000000c214
-     1.30%  find     find                  [.] 0x000000000001ea34
-     1.27%  find     find                  [.] 0x000000000001ea96
+     1.29%  find     find                  [.] fts_safe_changedir.lto_priv.0
      1.26%  find     [unknown]             [k] 0xffffffffad4001c8
      1.25%  find     libc.so.6             [.] __libc_fcntl64
      1.23%  find     libc.so.6             [.] _int_free_create_chunk
-     1.22%  find     find                  [.] 0x000000000000bfb9
-     1.22%  find     find                  [.] 0x000000000000bbde
-     1.22%  find     find                  [.] 0x000000000000b4a2
-     1.20%  find     find                  [.] 0x0000000000006918
+     1.20%  find     find                  [.] pred_and
      1.16%  find     libc.so.6             [.] __fcntl64_nocancel_adjusted
+     1.15%  find     find                  [.] AD_hash
      1.12%  find     libc.so.6             [.] cfree@GLIBC_2.2.5
      1.05%  find     libc.so.6             [.] __strchrnul_ifunc@plt
      1.03%  find     libc.so.6             [.] __libc_openat64
      1.01%  find     libc.so.6             [.] __strchrnul_avx2
-     0.69%  find     find                  [.] 0x0000000000008a0e
-     0.68%  find     find                  [.] 0x000000000000b553
-     0.67%  find     find                  [.] 0x000000000001ea63
-     0.67%  find     find                  [.] 0x0000000000006869
-     0.65%  find     find                  [.] 0x0000000000019e82
-     0.65%  find     find                  [.] 0x000000000000bbc5
-     0.65%  find     find                  [.] 0x000000000001117e
-     0.64%  find     find                  [.] 0x0000000000019fc6
-     0.64%  find     find                  [.] 0x000000000001111c
-     0.63%  find     find                  [.] 0x0000000000008a19
-     0.63%  find     find                  [.] 0x0000000000018b3d
-     0.63%  find     find                  [.] 0x000000000000b61e
+     0.97%  find     find                  [.] leave_dir.lto_priv.0
+     0.67%  find     find                  [.] apply_predicate
+     0.63%  find     find                  [.] cwd_advance_fd.lto_priv.0
      0.63%  find     libc.so.6             [.] __GI___fstatat64
-     0.63%  find     find                  [.] 0x000000000001f0de
      0.63%  find     libc.so.6             [.] __fstat64
-     0.63%  find     find                  [.] 0x000000000001edfb
-     0.62%  find     find                  [.] 0x000000000001113f
-     0.61%  find     find                  [.] 0x000000000000c223
-     0.61%  find     find                  [.] 0x000000000000c06b
-     0.61%  find     find                  [.] 0x000000000000fd90
-     0.61%  find     find                  [.] 0x0000000000018d98
-     0.60%  find     find                  [.] 0x0000000000017cfa
-     0.60%  find     find                  [.] 0x000000000001e990
-     0.60%  find     find                  [.] 0x000000000000b657
+     0.60%  find     find                  [.] rpl_fcntl
      0.59%  find     find                  [.] malloc@plt
-     0.59%  find     find                  [.] 0x000000000000c099
-     0.59%  find     find                  [.] 0x00000000000089d9
      0.58%  find     ld-linux-x86-64.so.2  [.] _dl_process_pt_gnu_property
      0.57%  find     libc.so.6             [.] unlink_chunk.isra.0
-     0.56%  find     find                  [.] 0x000000000001ea4e
-     0.56%  find     find                  [.] 0x000000000000b64b
      0.56%  find     libc.so.6             [.] malloc@plt
-     0.54%  find     find                  [.] 0x00000000000110e6
-     0.54%  find     find                  [.] 0x000000000001ead0
-     0.54%  find     find                  [.] 0x000000000000fdc7
-     0.53%  find     find                  [.] 0x000000000000fd8a
-     0.52%  find     find                  [.] 0x0000000000011e07
-     0.52%  find     find                  [.] 0x000000000000b6a8
-     0.48%  find     find                  [.] 0x0000000000012463
+     0.54%  find     find                  [.] fts_compare_ino
+     0.52%  find     find                  [.] hash_find_entry
+     0.48%  find     find                  [.] fts_sort
      0.47%  find     libc.so.6             [.] __printf_buffer_to_file_switch
      0.42%  find     libc.so.6             [.] alloc_perturb
-     0.42%  find     find                  [.] 0x000000000000bfc2
-     0.41%  find     find                  [.] 0x0000000000011179
-     0.40%  find     find                  [.] 0x000000000000c234
-     0.36%  find     find                  [.] 0x0000000000018cc0
      0.14%  find     ld-linux-x86-64.so.2  [.] _dl_sysdep_parse_arguments
      0.01%  find     ld-linux-x86-64.so.2  [.] _dl_start
      0.00%  find     ld-linux-x86-64.so.2  [.] _start
 
 
 #
-# (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
+# (Tip: To see callchains in a more compact form: perf report -g folded)
 #
⬢ [acme@toolbox perf-tools-next]$

⬢ [acme@toolbox perf-tools-next]$ find ~/.debug/ -name af3f04d1b31abc9e5ce8428110e424fd980a37
⬢ [acme@toolbox perf-tools-next]$ find ~/.cache/ -name af3f04d1b31abc9e5ce8428110e424fd980a37
⬢ [acme@toolbox perf-tools-next]$ 
⬢ [acme@toolbox perf-tools-next]$ rpm -qf /bin/find
findutils-4.9.0-9.fc40.x86_64
⬢ [acme@toolbox perf-tools-next]$ rpm -q findutils-debuginfo
package findutils-debuginfo is not installed
⬢ [acme@toolbox perf-tools-next]$

And /bin/find has only undefined (UND) function symbols in its symtabs:

⬢ [acme@toolbox perf-tools-next]$ readelf -sW /bin/find | grep -w FUNC | wc -l
145
⬢ [acme@toolbox perf-tools-next]$ readelf -sW /bin/find | grep -w FUNC | grep -vw UND
⬢ [acme@toolbox perf-tools-next]$

⬢ [acme@toolbox perf-tools-next]$ readelf -SW /bin/find  | grep SYM
  [ 7] .dynsym           DYNSYM          00000000000004a0 0004a0 000ed0 18   A  8   1  8
  [ 9] .gnu.version      VERSYM          00000000000019a0 0019a0 00013c 02   A  7   0  2
⬢ [acme@toolbox perf-tools-next]$

And that matches eu-readelf output, almost the same (UND => UNDEF):

⬢ [acme@toolbox perf-tools-next]$ eu-readelf -s /bin/find | grep -w FUNC | wc -l
145
⬢ [acme@toolbox perf-tools-next]$
⬢ [acme@toolbox perf-tools-next]$ eu-readelf -s /bin/find | grep -w FUNC | grep -vw UNDEF
⬢ [acme@toolbox perf-tools-next]$

eu-readelf has a way to use that section, though:

⬢ [acme@toolbox perf-tools-next]$ man eu-readelf | grep -A2 -- --elf-section
               [--elf-section [section] ]
               [-w|
                --debug-dump[=line,=decodedline,=info,=info+,=abbrev,=pubnames,=aranges,=macro,=frames,=str,=loc,=ranges,=gdb_index,=addr]]
--
       --elf-section [section]
           Use the named SECTION (default .gnu_debugdata) as (compressed) ELF input data

⬢ [acme@toolbox perf-tools-next]$

⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | wc -l
339
⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | head
    1: 00000000000056d0     35 FUNC    LOCAL  DEFAULT       17 entry_hashfunc
    2: 0000000000005700     34 FUNC    LOCAL  DEFAULT       17 entry_comparator
    3: 0000000000005920    121 FUNC    LOCAL  DEFAULT       17 subtree_has_side_effects
    4: 00000000000059a0    992 FUNC    LOCAL  DEFAULT       17 worst_cost.part.0
    5: 0000000000005d80    449 FUNC    LOCAL  DEFAULT       17 traverse_tree
    6: 0000000000005f50     73 FUNC    LOCAL  DEFAULT       17 undangle_file_pointers
    7: 0000000000005fa0     72 FUNC    LOCAL  DEFAULT       17 looks_like_expression
    8: 0000000000006030    303 FUNC    LOCAL  DEFAULT       17 get_fts_info_name
    9: 0000000000006190     35 FUNC    LOCAL  DEFAULT       17 inside_dir.part.0
   10: 0000000000006330    451 FUNC    LOCAL  DEFAULT       17 pred_sanity_check
⬢ [acme@toolbox perf-tools-next]$
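
For reference, the grep pipeline above (keep FUNC entries, drop undefined ones) could be reproduced roughly as follows. This is a sketch in Python with a hypothetical helper name; the first two sample lines are copied from the output above, while the UNDEF line is made up for the example:

```python
def defined_funcs(symtab_text):
    """Return names of FUNC symbols that are defined (not UND/UNDEF)."""
    names = []
    for line in symtab_text.splitlines():
        fields = line.split()
        # Expected columns: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        # readelf prints UND, eu-readelf prints UNDEF
        if fields[3] == "FUNC" and fields[6] not in ("UND", "UNDEF"):
            names.append(fields[7])
    return names


SAMPLE = """\
    1: 00000000000056d0     35 FUNC    LOCAL  DEFAULT       17 entry_hashfunc
    2: 0000000000005700     34 FUNC    LOCAL  DEFAULT       17 entry_comparator
   11: 0000000000000000      0 FUNC    GLOBAL DEFAULT    UNDEF free@GLIBC_2.2.5 (2)
"""

if __name__ == "__main__":
    print(defined_funcs(SAMPLE))
```

Fed the full eu-readelf --elf-section -s output, the length of the returned list would be expected to correspond to the wc -l count shown above.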

So there we can find the new entries, such as the top one in the example
profile session above:

⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | grep -w consider_visiting
   48: 000000000000b460   2544 FUNC    LOCAL  DEFAULT       17 consider_visiting
⬢ [acme@toolbox perf-tools-next]$

And that address matches the resolution perf did with your patches:

⬢ [acme@toolbox perf-tools-next]$ perf report -v --stdio |& head
build id event received for [vdso]: a2184b81fbbc08eff401d16259eca8ad5f9d8988 [20]
build id event received for /usr/bin/find: 3faf3f04d1b31abc9e5ce8428110e424fd980a37 [20]
build id event received for /usr/lib64/ld-linux-x86-64.so.2: 765f7ab0f3569ffe98de85864a0cedda9b686994 [20]
build id event received for /usr/lib64/libc.so.6: c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba [20]
build id event received for [kernel.kallsyms]: c3fbb7df4dfb94762b1648bc65e4363e50f45585 [20]
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/find
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
⬢ [acme@toolbox perf-tools-next]$ perf report -v --stdio |& head -20
build id event received for [vdso]: a2184b81fbbc08eff401d16259eca8ad5f9d8988 [20]
build id event received for /usr/bin/find: 3faf3f04d1b31abc9e5ce8428110e424fd980a37 [20]
build id event received for /usr/lib64/ld-linux-x86-64.so.2: 765f7ab0f3569ffe98de85864a0cedda9b686994 [20]
build id event received for /usr/lib64/libc.so.6: c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba [20]
build id event received for [kernel.kallsyms]: c3fbb7df4dfb94762b1648bc65e4363e50f45585 [20]
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/find
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 163  of event 'cpu_core/cycles/Pu'
# Event count (approx.): 68126524
#
# Overhead  Command  Shared Object                    Symbol                                                 
# ........  .......  ...............................  .......................................................
#
     8.72%  find     /usr/bin/find                    0xb498             
     7.90%  find     /usr/lib64/libc.so.6             0xe51e0            B [.] __GI___readdir64
     7.44%  find     /usr/lib64/libc.so.6             0xa77cd            B [.] _int_malloc
⬢ [acme@toolbox perf-tools-next]$
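
A quick way to check that match by hand: an address belongs to a symbol when it lies in the half-open range [value, value + size). The sketch below (Python, with a hypothetical helper name) uses the consider_visiting value and size from the eu-readelf output, the 0xb498 address from the verbose report, and 0xbf8e, one of the unresolved addresses from the "before" report, as a counter-example:

```python
def symbol_contains(value, size, addr):
    """True if addr lies inside the symbol's [value, value + size) range."""
    return value <= addr < value + size


CONSIDER_VISITING = (0xB460, 2544)  # value and size from eu-readelf

if __name__ == "__main__":
    value, size = CONSIDER_VISITING
    print(hex(value + size))                     # 0xbe50, first byte past the symbol
    print(symbol_contains(value, size, 0xB498))  # True
    print(symbol_contains(value, size, 0xBF8E))  # False, belongs to another symbol
```

So 0xb498 sits 0x38 bytes into consider_visiting, consistent with the non-verbose report attributing that sample to it.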

The only strange thing was not having it resolved in the -v case, which
I think is because you added a new type of DSO but didn't update the
code that handles the verbose 'perf report -v' case?

I ran out of time, have to go AFK now, can you please take a look,
Stephen?

DSO_BINARY_TYPE__GNU_DEBUGDATA should be handled at...

int dso__read_binary_type_filename(const struct dso *dso,
                                   enum dso_binary_type type,
                                   char *root_dir, char *filename, size_t size)

But you have it there, ok, I'll try to continue later.

Other than that the patch looks great and makes use of this new mini
symtab, excellent!

- Arnaldo
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Stephen Brennan 9 months, 1 week ago
Arnaldo Carvalho de Melo <acme@kernel.org> writes:
> On Fri, Mar 07, 2025 at 05:18:36PM -0300, Arnaldo Carvalho de Melo wrote:
[...]
> It has a way to use that section tho:
>
> ⬢ [acme@toolbox perf-tools-next]$ man eu-readelf | grep -A2 -- --elf-section
>                [--elf-section [section] ]
>                [-w|
>                 --debug-dump[=line,=decodedline,=info,=info+,=abbrev,=pubnames,=aranges,=macro,=frames,=str,=loc,=ranges,=gdb_index,=addr]]
> --
>        --elf-section [section]
>            Use the named SECTION (default .gnu_debugdata) as (compressed) ELF input data
>
> ⬢ [acme@toolbox perf-tools-next]$
>
> ⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | wc -l
> 339
> ⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | head
>     1: 00000000000056d0     35 FUNC    LOCAL  DEFAULT       17 entry_hashfunc
>     2: 0000000000005700     34 FUNC    LOCAL  DEFAULT       17 entry_comparator
>     3: 0000000000005920    121 FUNC    LOCAL  DEFAULT       17 subtree_has_side_effects
>     4: 00000000000059a0    992 FUNC    LOCAL  DEFAULT       17 worst_cost.part.0
>     5: 0000000000005d80    449 FUNC    LOCAL  DEFAULT       17 traverse_tree
>     6: 0000000000005f50     73 FUNC    LOCAL  DEFAULT       17 undangle_file_pointers
>     7: 0000000000005fa0     72 FUNC    LOCAL  DEFAULT       17 looks_like_expression
>     8: 0000000000006030    303 FUNC    LOCAL  DEFAULT       17 get_fts_info_name
>     9: 0000000000006190     35 FUNC    LOCAL  DEFAULT       17 inside_dir.part.0
>    10: 0000000000006330    451 FUNC    LOCAL  DEFAULT       17 pred_sanity_check
> ⬢ [acme@toolbox perf-tools-next]$

Wow, thank you for teaching me that!
I had been using:

  gdb /usr/bin/bash --batch -ex 'maint print msymbols'

Because I knew GDB had support for .gnu_debugdata. But the --elf-section
argument to eu-readelf is much more useful.

> So there we can find the new entries, such as the top one in the example
> profile session above:
>
> ⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | grep -w consider_visiting
>    48: 000000000000b460   2544 FUNC    LOCAL  DEFAULT       17 consider_visiting
> ⬢ [acme@toolbox perf-tools-next]$
>
> And that address matches the resolution perf did with your patches:
>
> ⬢ [acme@toolbox perf-tools-next]$ perf report -v --stdio |& head
> build id event received for [vdso]: a2184b81fbbc08eff401d16259eca8ad5f9d8988 [20]
> build id event received for /usr/bin/find: 3faf3f04d1b31abc9e5ce8428110e424fd980a37 [20]
> build id event received for /usr/lib64/ld-linux-x86-64.so.2: 765f7ab0f3569ffe98de85864a0cedda9b686994 [20]
> build id event received for /usr/lib64/libc.so.6: c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba [20]
> build id event received for [kernel.kallsyms]: c3fbb7df4dfb94762b1648bc65e4363e50f45585 [20]
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/find
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> ⬢ [acme@toolbox perf-tools-next]$ perf report -v --stdio |& head -20
> build id event received for [vdso]: a2184b81fbbc08eff401d16259eca8ad5f9d8988 [20]
> build id event received for /usr/bin/find: 3faf3f04d1b31abc9e5ce8428110e424fd980a37 [20]
> build id event received for /usr/lib64/ld-linux-x86-64.so.2: 765f7ab0f3569ffe98de85864a0cedda9b686994 [20]
> build id event received for /usr/lib64/libc.so.6: c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba [20]
> build id event received for [kernel.kallsyms]: c3fbb7df4dfb94762b1648bc65e4363e50f45585 [20]
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/find
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 163  of event 'cpu_core/cycles/Pu'
> # Event count (approx.): 68126524
> #
> # Overhead  Command  Shared Object                    Symbol
> # ........  .......  ...............................  .......................................................
> #
>      8.72%  find     /usr/bin/find                    0xb498
>      7.90%  find     /usr/lib64/libc.so.6             0xe51e0            B [.] __GI___readdir64
>      7.44%  find     /usr/lib64/libc.so.6             0xa77cd            B [.] _int_malloc
> ⬢ [acme@toolbox perf-tools-next]$
>
> The only strange thing was not having it resolved in the -v case, which
> I think it's because you added a new type of DSO but didn't update the
> code that does the 'perf report -v' verbose case?
>
> I ran out of time, have to go AFK now, can you please take a look,
> Stephen?

Thanks for the catch. I double checked all the places where
DSO_BINARY_TYPE constants are enumerated, and it turns out I missed
adding an entry to

char dso__symtab_origin(const struct dso *dso) ...

I assume that the array defaulted to '\0' which terminated the string
too early for this line. Oops!

Most of the letters I would associate with ".gnu_debugdata" are
taken (namely, g/G for GNU, m/M for MiniDebugInfo, d/D for
debugdata...). So 'n', for the second letter of GNU, is my selection
unless you feel differently. With that change, the table is fixed for
"perf report -v". Here it is running against my test data focusing on a
symbol only found in .gnu_debugdata of bash:

$ ./perf report -v --stdio -i ~/repos/UEK6/perf.data 2>&1 | egrep yy_readline_get\|gnu_debugdata
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/bash
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sed
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/date
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sqlite3
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sleep
     0.20%     0.00%  bash     /usr/bin/bash                    0x55fdc4509dbe     n [.] yy_readline_get

I'll update the patch accordingly.

> DSO_BINARY_TYPE__GNU_DEBUGDATA should be handled at...
>
> int dso__read_binary_type_filename(const struct dso *dso,
>                                    enum dso_binary_type type,
>                                    char *root_dir, char *filename, size_t size)
>
> But you have it there, ok, I'll try to continue later.
>
> Other than that the patch looks great and makes use of this new mini
> symtab, excellent!

And thank you for your testing!

> - Arnaldo
Re: [PATCH v2 0/3] Support .gnu_debugdata for symbols in perf
Posted by Arnaldo Carvalho de Melo 9 months, 1 week ago
On Fri, Mar 07, 2025 at 02:33:01PM -0800, Stephen Brennan wrote:
> Arnaldo Carvalho de Melo <acme@kernel.org> writes:
> > On Fri, Mar 07, 2025 at 05:18:36PM -0300, Arnaldo Carvalho de Melo wrote:
> [...]
> > It has a way to use that section tho:

> > ⬢ [acme@toolbox perf-tools-next]$ man eu-readelf | grep -A2 -- --elf-section
> >                [--elf-section [section] ]
> >                [-w|
> >                 --debug-dump[=line,=decodedline,=info,=info+,=abbrev,=pubnames,=aranges,=macro,=frames,=str,=loc,=ranges,=gdb_index,=addr]]
> > --
> >        --elf-section [section]
> >            Use the named SECTION (default .gnu_debugdata) as (compressed) ELF input data

> > ⬢ [acme@toolbox perf-tools-next]$

> > ⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | wc -l
> > 339
> > ⬢ [acme@toolbox perf-tools-next]$ eu-readelf --elf-section -s /bin/find | grep -w FUNC | grep -vw UNDEF | head
> >     1: 00000000000056d0     35 FUNC    LOCAL  DEFAULT       17 entry_hashfunc
> >     2: 0000000000005700     34 FUNC    LOCAL  DEFAULT       17 entry_comparator
> >     3: 0000000000005920    121 FUNC    LOCAL  DEFAULT       17 subtree_has_side_effects
> >     4: 00000000000059a0    992 FUNC    LOCAL  DEFAULT       17 worst_cost.part.0
> >     5: 0000000000005d80    449 FUNC    LOCAL  DEFAULT       17 traverse_tree
> >     6: 0000000000005f50     73 FUNC    LOCAL  DEFAULT       17 undangle_file_pointers
> >     7: 0000000000005fa0     72 FUNC    LOCAL  DEFAULT       17 looks_like_expression
> >     8: 0000000000006030    303 FUNC    LOCAL  DEFAULT       17 get_fts_info_name
> >     9: 0000000000006190     35 FUNC    LOCAL  DEFAULT       17 inside_dir.part.0
> >    10: 0000000000006330    451 FUNC    LOCAL  DEFAULT       17 pred_sanity_check
> > ⬢ [acme@toolbox perf-tools-next]$
 
> Wow, thank you for teaching me that!
> I had been using:
 
>   gdb /usr/bin/bash --batch -ex 'maint print msymbols'
 
> Because I knew GDB had support for .gnu_debugdata. But the --elf-section
> argument to eu-readelf is much more useful.

That was a nice assumption and it did the work for you :-)

I thought that the elfutils guys would add something to eu-readelf for
them to dump the compressed data in human-readable form, so I looked at the
man page and voilà!

<SNIP>
 
> > # Overhead  Command  Shared Object                    Symbol
> > # ........  .......  ...............................  .......................................................
> > #
> >      8.72%  find     /usr/bin/find                    0xb498
> >      7.90%  find     /usr/lib64/libc.so.6             0xe51e0            B [.] __GI___readdir64
> >      7.44%  find     /usr/lib64/libc.so.6             0xa77cd            B [.] _int_malloc
> > ⬢ [acme@toolbox perf-tools-next]$

> > The only strange thing was not having it resolved in the -v case, which
> > I think it's because you added a new type of DSO but didn't update the
> > code that does the 'perf report -v' verbose case?

> > I ran out of time, have to go AFK now, can you please take a look,
> > Stephen?
 
> Thanks for the catch. I double-checked all the places where
> DSO_BINARY_TYPE constants are enumerated, and it turns out I missed
> adding an entry to
 
> char dso__symtab_origin(const struct dso *dso) ...
> 
> I assume that the array defaulted to '\0', which terminated the string
> too early for this line. Oops!
> 
> Most of the letters I would associate with ".gnu_debugdata" are
> taken (namely, g/G for GNU, m/M for MiniDebugInfo, d/D for
> debugdata...). So 'n', for the second letter of GNU, is my selection
> unless you feel differently. With that change, the table is fixed for
> "perf report -v". Here it is running against my test data focusing on a
> symbol only found in .gnu_debugdata of bash:
> 
> $ ./perf report -v --stdio -i ~/repos/UEK6/perf.data 2>&1 | egrep yy_readline_get\|gnu_debugdata
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/bash
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> unwind: yy_readline_get:ip = 0x55fdc4509dbe (0x33dbe)
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sed
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/date
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sqlite3
> read_gnu_debugdata: using .gnu_debugdata of /usr/bin/sleep
>      0.20%     0.00%  bash     /usr/bin/bash                    0x55fdc4509dbe     n [.] yy_readline_get
> 
> I'll update the patch accordingly.

I just tested it, works as expected:

⬢ [acme@toolbox perf-tools-next]$ perf report -v --stdio |& head -20
build id event received for [vdso]: a2184b81fbbc08eff401d16259eca8ad5f9d8988 [20]
build id event received for /usr/bin/find: 3faf3f04d1b31abc9e5ce8428110e424fd980a37 [20]
build id event received for /usr/lib64/ld-linux-x86-64.so.2: 765f7ab0f3569ffe98de85864a0cedda9b686994 [20]
build id event received for /usr/lib64/libc.so.6: c8c3fa52aaee3f5d73b6fd862e39e9d4c010b6ba [20]
build id event received for [kernel.kallsyms]: c3fbb7df4dfb94762b1648bc65e4363e50f45585 [20]
read_gnu_debugdata: using .gnu_debugdata of /usr/bin/find
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 163  of event 'cpu_core/cycles/Pu'
# Event count (approx.): 68126524
#
# Overhead  Command  Shared Object                    Symbol                                                 
# ........  .......  ...............................  .......................................................
#
     8.72%  find     /usr/bin/find                    0xb498             n [.] consider_visiting
     7.90%  find     /usr/lib64/libc.so.6             0xe51e0            B [.] __GI___readdir64
     7.44%  find     /usr/lib64/libc.so.6             0xa77cd            B [.] _int_malloc
⬢ [acme@toolbox perf-tools-next]$

I'll reply to the v3 thread with my Tested-by.

Thanks!

- Arnaldo