[PATCH 0/4] perf ftrace: Add 'profile' subcommand (v1)

Namhyung Kim posted 4 patches 1 year, 6 months ago
tools/perf/Documentation/perf-ftrace.txt |  48 ++-
tools/perf/builtin-ftrace.c              | 439 +++++++++++++++++++++--
tools/perf/util/ftrace.h                 |   3 +
3 files changed, 463 insertions(+), 27 deletions(-)
Posted by Namhyung Kim 1 year, 6 months ago
Hello,

This is an attempt to extend the perf ftrace command to show a kernel
function profile using the function_graph tracer.  It is useful for seeing
detailed info such as the total, average and max time (in usec) and the
number of calls for each function.

  $ sudo perf ftrace profile -- sync | head
  # Total (us)   Avg (us)   Max (us)      Count   Function
      7638.372   7638.372   7638.372          1   __do_sys_sync
      7638.059   7638.059   7638.059          1   ksys_sync
      5893.959   1964.653   3747.963          3   iterate_supers
      5214.181    579.353   1688.752          9   schedule
      3585.773     44.269   3537.329         81   sync_inodes_one_sb
      3566.179     44.027   3537.078         81   sync_inodes_sb
      1976.901    247.113   1968.070          8   filemap_fdatawait_keep_errors
      1974.367    246.796   1967.895          8   __filemap_fdatawait_range
      1935.407     37.219   1157.627         52   folio_wait_writeback

While the kernel provides similar functionality (IIRC under
CONFIG_FUNCTION_PROFILER), it's often not enabled on distro kernels, so I
implemented it in user space.
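
As a rough illustration of what such a user-space profile has to compute
(this is not the code from this series, which lives in builtin-ftrace.c),
the sketch below folds per-call durations into the total, average, max and
count columns shown above.  The input format, a list of (function name,
duration in usec) pairs, and the names FuncStat, build_profile and
format_profile are assumptions made for the example only.

```python
from dataclasses import dataclass


@dataclass
class FuncStat:
    """Running statistics for one function (all times in microseconds)."""
    total_us: float = 0.0
    max_us: float = 0.0
    count: int = 0

    @property
    def avg_us(self) -> float:
        # Average is derived from total and count, not stored separately.
        return self.total_us / self.count if self.count else 0.0


def build_profile(samples):
    """Aggregate (function_name, duration_us) pairs into per-function stats."""
    profile = {}
    for name, duration_us in samples:
        stat = profile.setdefault(name, FuncStat())
        stat.total_us += duration_us
        stat.max_us = max(stat.max_us, duration_us)
        stat.count += 1
    return profile


def format_profile(profile):
    """Render rows ordered by total time, largest first."""
    lines = ["# Total (us)   Avg (us)   Max (us)      Count   Function"]
    rows = sorted(profile.items(), key=lambda kv: kv[1].total_us, reverse=True)
    for name, stat in rows:
        lines.append(
            f"{stat.total_us:12.3f} {stat.avg_us:10.3f} "
            f"{stat.max_us:10.3f} {stat.count:10d}   {name}"
        )
    return lines


# Hypothetical durations; a real run would take them from the tracer output.
samples = [
    ("func_a", 2.0),
    ("func_b", 1.5),
    ("func_a", 4.0),
    ("func_b", 0.5),
    ("func_a", 3.0),
]
profile = build_profile(samples)
for line in format_profile(profile):
    print(line)
```

Only three values are kept per function, so memory use depends on the
number of distinct functions rather than the number of calls.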

It also supports function filters like 'perf ftrace trace' does, so users
can focus on specific target functions, and the buffer size can be changed
if needed.

  $ sudo perf ftrace profile -h
  
   Usage: perf ftrace [<options>] [<command>]
      or: perf ftrace [<options>] -- [<command>] [<options>]
      or: perf ftrace {trace|latency|profile} [<options>] [<command>]
      or: perf ftrace {trace|latency|profile} [<options>] -- [<command>] [<options>]
  
      -a, --all-cpus        System-wide collection from all CPUs
      -C, --cpu <cpu>       List of cpus to monitor
      -G, --graph-funcs <func>
                            Trace given functions using function_graph tracer
      -g, --nograph-funcs <func>
                            Set nograph filter on given functions
      -m, --buffer-size <size>
                            Size of per cpu buffer, needs to use a B, K, M or G suffix.
      -N, --notrace-funcs <func>
                            Do not trace given functions
      -p, --pid <pid>       Trace on existing process id
      -s, --sort <key>      Sort result by key: total (default), avg, max, count, name.
      -T, --trace-funcs <func>
                            Trace given functions using function tracer
      -v, --verbose         Be more verbose
          --tid <tid>       Trace on existing thread id (exclusive to --pid)
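
To make the -s/--sort keys concrete, here is a small sketch that reorders a
few rows taken from the sample output above by each of the listed keys.
The ordering direction (largest first for the numeric keys, alphabetical
for name) is an assumption for illustration; the help text does not state
it.  The names ROWS, SORT_KEYS and sort_profile are hypothetical and not
taken from the patches.

```python
from operator import itemgetter

# A few rows copied from the 'perf ftrace profile -- sync' sample output.
ROWS = [
    {"name": "__do_sys_sync", "total": 7638.372, "avg": 7638.372,
     "max": 7638.372, "count": 1},
    {"name": "iterate_supers", "total": 5893.959, "avg": 1964.653,
     "max": 3747.963, "count": 3},
    {"name": "schedule", "total": 5214.181, "avg": 579.353,
     "max": 1688.752, "count": 9},
    {"name": "sync_inodes_sb", "total": 3566.179, "avg": 44.027,
     "max": 3537.078, "count": 81},
]

# Keys listed in the help text for -s/--sort.
SORT_KEYS = ("total", "avg", "max", "count", "name")


def sort_profile(rows, key="total"):
    """Return rows ordered by the given sort key.

    Assumption: numeric keys sort largest first, 'name' sorts alphabetically.
    """
    if key not in SORT_KEYS:
        raise ValueError(f"unknown sort key: {key}")
    descending = key != "name"
    return sorted(rows, key=itemgetter(key), reverse=descending)


for key in SORT_KEYS:
    order = [row["name"] for row in sort_profile(ROWS, key)]
    print(f"{key:>5}: {', '.join(order)}")
```

With these rows, sorting by count moves sync_inodes_sb to the top even
though its total time is the smallest of the four.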


The code is also available in the 'perf/ftrace-profile-v1' branch at
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (4):
  perf ftrace: Add 'tail' option to --graph-opts
  perf ftrace: Factor out check_ftrace_capable()
  perf ftrace: Add 'profile' command
  perf ftrace: Add -s/--sort option to profile sub-command

 tools/perf/Documentation/perf-ftrace.txt |  48 ++-
 tools/perf/builtin-ftrace.c              | 439 +++++++++++++++++++++--
 tools/perf/util/ftrace.h                 |   3 +
 3 files changed, 463 insertions(+), 27 deletions(-)

-- 
2.46.0.rc1.232.g9752f9e123-goog
Re: [PATCH 0/4] perf ftrace: Add 'profile' subcommand (v1)
Posted by Arnaldo Carvalho de Melo 1 year, 6 months ago
On Sun, Jul 28, 2024 at 05:41:23PM -0700, Namhyung Kim wrote:
> Hello,
> 
> This is an attempt to extend perf ftrace command to show a kernel function
> profile using the function_graph tracer.  This is useful to see detailed
> info like total, average, max time (in usec) and number of calls for each
> function.
> 
>   $ sudo perf ftrace profile -- sync | head
>   # Total (us)   Avg (us)   Max (us)      Count   Function
>       7638.372   7638.372   7638.372          1   __do_sys_sync
>       7638.059   7638.059   7638.059          1   ksys_sync
>       5893.959   1964.653   3747.963          3   iterate_supers
>       5214.181    579.353   1688.752          9   schedule
>       3585.773     44.269   3537.329         81   sync_inodes_one_sb
>       3566.179     44.027   3537.078         81   sync_inodes_sb
>       1976.901    247.113   1968.070          8   filemap_fdatawait_keep_errors
>       1974.367    246.796   1967.895          8   __filemap_fdatawait_range
>       1935.407     37.219   1157.627         52   folio_wait_writeback
> 
> While the kernel also provides the similar functionality IIRC under
> CONFIG_FUNCTION_PROFILER, it's often not enabled on distro kernels so I
> implemented it in user space.

Great functionality, tested it all and applied to tmp.perf-tools-next,
will be in perf-tools-next after one last round of container builds.

The discussion about libcap seems to still be open, so I'm applying what
is in this series as it is small and simple, we can go on from there.

Thanks!

- Arnaldo
 
Re: [PATCH 0/4] perf ftrace: Add 'profile' subcommand (v1)
Posted by Ian Rogers 1 year, 6 months ago
On Tue, Jul 30, 2024 at 12:19 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Sun, Jul 28, 2024 at 05:41:23PM -0700, Namhyung Kim wrote:
> > Hello,
> >
> > This is an attempt to extend perf ftrace command to show a kernel function
> > profile using the function_graph tracer.  This is useful to see detailed
> > info like total, average, max time (in usec) and number of calls for each
> > function.
> >
> >   $ sudo perf ftrace profile -- sync | head
> >   # Total (us)   Avg (us)   Max (us)      Count   Function
> >       7638.372   7638.372   7638.372          1   __do_sys_sync
> >       7638.059   7638.059   7638.059          1   ksys_sync
> >       5893.959   1964.653   3747.963          3   iterate_supers
> >       5214.181    579.353   1688.752          9   schedule
> >       3585.773     44.269   3537.329         81   sync_inodes_one_sb
> >       3566.179     44.027   3537.078         81   sync_inodes_sb
> >       1976.901    247.113   1968.070          8   filemap_fdatawait_keep_errors
> >       1974.367    246.796   1967.895          8   __filemap_fdatawait_range
> >       1935.407     37.219   1157.627         52   folio_wait_writeback
> >
> > While the kernel also provides the similar functionality IIRC under
> > CONFIG_FUNCTION_PROFILER, it's often not enabled on distro kernels so I
> > implemented it in user space.
>
> Great functionality, tested it all and applied to tmp.perf-tools-next,
> will be in perf-tools-next after one last round of container builds.
>
> The discussion about libcap seems to still be open, so I'm applying what
> is in this series as it is small and simple, we can go on from there.

Sgtm. I did the libcap cleanup on perf-tools-next, but
tmp.perf-tools-next hasn't been merged there for 3 weeks. It'd be nice
to rebase the patch on perf-tools-next, but should I just shift to
working on tmp.perf-tools-next?

Thanks,
Ian

Re: [PATCH 0/4] perf ftrace: Add 'profile' subcommand (v1)
Posted by Ian Rogers 1 year, 6 months ago
On Sun, Jul 28, 2024 at 5:41 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> This is an attempt to extend perf ftrace command to show a kernel function
> profile using the function_graph tracer.  This is useful to see detailed
> info like total, average, max time (in usec) and number of calls for each
> function.
>
>   $ sudo perf ftrace profile -- sync | head
>   # Total (us)   Avg (us)   Max (us)      Count   Function
>       7638.372   7638.372   7638.372          1   __do_sys_sync
>       7638.059   7638.059   7638.059          1   ksys_sync
>       5893.959   1964.653   3747.963          3   iterate_supers
>       5214.181    579.353   1688.752          9   schedule
>       3585.773     44.269   3537.329         81   sync_inodes_one_sb
>       3566.179     44.027   3537.078         81   sync_inodes_sb
>       1976.901    247.113   1968.070          8   filemap_fdatawait_keep_errors
>       1974.367    246.796   1967.895          8   __filemap_fdatawait_range
>       1935.407     37.219   1157.627         52   folio_wait_writeback
>
> While the kernel also provides the similar functionality IIRC under
> CONFIG_FUNCTION_PROFILER, it's often not enabled on distro kernels so I
> implemented it in user space.
>
> Also it can support function filters like 'perf ftrace trace' so users
> can focus on some target functions and change the buffer size if needed.
>
>   $ sudo perf ftrace profile -h
>
>    Usage: perf ftrace [<options>] [<command>]
>       or: perf ftrace [<options>] -- [<command>] [<options>]
>       or: perf ftrace {trace|latency|profile} [<options>] [<command>]
>       or: perf ftrace {trace|latency|profile} [<options>] -- [<command>] [<options>]
>
>       -a, --all-cpus        System-wide collection from all CPUs
>       -C, --cpu <cpu>       List of cpus to monitor
>       -G, --graph-funcs <func>
>                             Trace given functions using function_graph tracer
>       -g, --nograph-funcs <func>
>                             Set nograph filter on given functions
>       -m, --buffer-size <size>
>                             Size of per cpu buffer, needs to use a B, K, M or G suffix.
>       -N, --notrace-funcs <func>
>                             Do not trace given functions
>       -p, --pid <pid>       Trace on existing process id
>       -s, --sort <key>      Sort result by key: total (default), avg, max, count, name.
>       -T, --trace-funcs <func>
>                             Trace given functions using function tracer
>       -v, --verbose         Be more verbose
>           --tid <tid>       Trace on existing thread id (exclusive to --pid)
>
>
> The code is also available in 'perf/ftrace-profile-v1' branch at
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung

Lgtm, need to think about (rebase, etc) wrt the libcap change I sent
but otherwise:
Reviewed-by: Ian Rogers <irogers@google.com>

Thanks,
Ian
