[v1] Add Scripts for Finding Top 25 Executed Functions

[PATCH 2/3] scripts/performance: Add callgrind_top_25.py script

Posted by Ahmed Karaman 5 years, 7 months ago

Python script that prints the top 25 most executed functions in QEMU
using callgrind.

Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
---
 scripts/performance/callgrind_top_25.py | 95 +++++++++++++++++++++++++
 1 file changed, 95 insertions(+)
 create mode 100644 scripts/performance/callgrind_top_25.py

diff --git a/scripts/performance/callgrind_top_25.py b/scripts/performance/callgrind_top_25.py
new file mode 100644
index 0000000000..03b089a96d
--- /dev/null
+++ b/scripts/performance/callgrind_top_25.py
@@ -0,0 +1,95 @@
+#!/usr/bin/env python3
+
+#  Print the top 25 most executed functions in QEMU using callgrind.
+#  Example Usage:
+#  callgrind_top_25.py <qemu-build>/x86_64-linux-user/qemu-x86_64 executable
+#
+#  This file is a part of the project "TCG Continuous Benchmarking".
+#
+#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
+#
+#  This program is free software: you can redistribute it and/or modify
+#  it under the terms of the GNU General Public License as published by
+#  the Free Software Foundation, either version 2 of the License, or
+#  (at your option) any later version.
+#
+#  This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+#  GNU General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+import os
+import sys
+
+# Ensure sufficient arguments
+if len(sys.argv) < 3:
+    print('Insufficient Script Arguments!')
+    sys.exit(1)
+
+# Get the qemu path and the executable + its arguments
+(qemu, executable) = (sys.argv[1], ' '.join(sys.argv[2:]))
+
+# Run callgrind and callgrind_annotate
+os.system('valgrind --tool=callgrind --callgrind-out-file=callgrind.data {} {} \
+            2 > / dev / null & & callgrind_annotate callgrind.data \
+            > tmp.callgrind.data'.
+          format(qemu, executable))
+
+# Line number with the total number of instructions
+number_of_instructions_line = 20
+
+# Line number with the top function
+first_func_line = 25
+
+# callgrind_annotate output
+callgrind_data = []
+
+# Open callgrind_annotate output and store it in callgrind_data
+with open('tmp.callgrind.data', 'r') as data:
+    callgrind_data = data.readlines()
+
+# Get the total number of instructions
+total_number_of_instructions = int(
+    callgrind_data[number_of_instructions_line].split(' ')[0].replace(',', ''))
+
+# Number of functions recorded by callgrind
+number_of_functions = len(callgrind_data) - first_func_line
+
+# Limit the number of top functions to 25
+number_of_top_functions = (25 if number_of_functions >
+                           25 else number_of_instructions_line)
+
+# Store the data of the top functions in top_functions[]
+top_functions = callgrind_data[first_func_line:
+                               first_func_line + number_of_top_functions]
+# Print information headers
+print('{:>4}  {:>10}  {:<25}  {}\n{}  {}  {}  {}'.format('No.',
+                                                         'Percentage',
+                                                         'Name',
+                                                         'Source File',
+                                                         '-' * 4,
+                                                         '-' * 10,
+                                                         '-' * 25,
+                                                         '-' * 30,
+                                                         ))
+
+# Print top 25 functions
+for (index, function) in enumerate(top_functions, start=1):
+    function_data = function.split()
+    # Calculate function percentage
+    percentage = (float(function_data[0].replace(
+        ',', '')) / total_number_of_instructions) * 100
+    # Get function source path and name
+    path, name = function_data[1].split(':')
+    # Print extracted data
+    print('{:>4}  {:>9.3f}%  {:<25}  {}'.format(index,
+                                                round(percentage, 3),
+                                                name,
+                                                path))
+
+# Remove intermediate files
+os.system('rm callgrind.data tmp.callgrind.data')
-- 
2.17.1

Re: [PATCH 2/3] scripts/performance: Add callgrind_top_25.py script

Posted by Aleksandar Markovic 5 years, 7 months ago

On Wednesday, June 17, 2020, Ahmed Karaman <ahmedkhaledkaraman@gmail.com> wrote:

> Python script that prints the top 25 most executed functions in QEMU
> using callgrind.
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---


I think you should add an example of script usage to the commit message
(even though you already give such an example in the script's comments),
together with the script output for that example.

Thanks,
Aleksandar



>  scripts/performance/callgrind_top_25.py | 95 +++++++++++++++++++++++++
>  1 file changed, 95 insertions(+)
>  create mode 100644 scripts/performance/callgrind_top_25.py
>
> diff --git a/scripts/performance/callgrind_top_25.py b/scripts/performance/callgrind_top_25.py
> new file mode 100644
> index 0000000000..03b089a96d
> --- /dev/null
> +++ b/scripts/performance/callgrind_top_25.py
> @@ -0,0 +1,95 @@
> +#!/usr/bin/env python3
> +
> +#  Print the top 25 most executed functions in QEMU using callgrind.
> +#  Example Usage:
> +#  callgrind_top_25.py <qemu-build>/x86_64-linux-user/qemu-x86_64 executable
> +#
> +#  This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +#  This program is free software: you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation, either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import os
> +import sys
> +
> +# Ensure sufficient arguments
> +if len(sys.argv) < 3:
> +    print('Insufficient Script Arguments!')
> +    sys.exit(1)
> +
> +# Get the qemu path and the executable + its arguments
> +(qemu, executable) = (sys.argv[1], ' '.join(sys.argv[2:]))
> +
> +# Run callgrind and callgrind_annotate
> +os.system('valgrind --tool=callgrind --callgrind-out-file=callgrind.data {} {} \
> +            2 > / dev / null & & callgrind_annotate callgrind.data \
> +            > tmp.callgrind.data'.
> +          format(qemu, executable))
> +
> +# Line number with the total number of instructions
> +number_of_instructions_line = 20
> +
> +# Line number with the top function
> +first_func_line = 25
> +
> +# callgrind_annotate output
> +callgrind_data = []
> +
> +# Open callgrind_annotate output and store it in callgrind_data
> +with open('tmp.callgrind.data', 'r') as data:
> +    callgrind_data = data.readlines()
> +
> +# Get the total number of instructions
> +total_number_of_instructions = int(
> +    callgrind_data[number_of_instructions_line].split(' ')[0].replace(',', ''))
> +
> +# Number of functions recorded by callgrind
> +number_of_functions = len(callgrind_data) - first_func_line
> +
> +# Limit the number of top functions to 25
> +number_of_top_functions = (25 if number_of_functions >
> +                           25 else number_of_instructions_line)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = callgrind_data[first_func_line:
> +                               first_func_line + number_of_top_functions]
> +# Print information headers
> +print('{:>4}  {:>10}  {:<25}  {}\n{}  {}  {}  {}'.format('No.',
> +                                                         'Percentage',
> +                                                         'Name',
> +                                                         'Source File',
> +                                                         '-' * 4,
> +                                                         '-' * 10,
> +                                                         '-' * 25,
> +                                                         '-' * 30,
> +                                                         ))
> +
> +# Print top 25 functions
> +for (index, function) in enumerate(top_functions, start=1):
> +    function_data = function.split()
> +    # Calculate function percentage
> +    percentage = (float(function_data[0].replace(
> +        ',', '')) / total_number_of_instructions) * 100
> +    # Get function source path and name
> +    path, name = function_data[1].split(':')
> +    # Print extracted data
> +    print('{:>4}  {:>9.3f}%  {:<25}  {}'.format(index,
> +                                                round(percentage, 3),
> +                                                name,
> +                                                path))
> +
> +# Remove intermediate files
> +os.system('rm callgrind.data tmp.callgrind.data')
> --
> 2.17.1
>
>

Re: [PATCH 2/3] scripts/performance: Add callgrind_top_25.py script

Posted by Alex Bennée 5 years, 7 months ago

Ahmed Karaman <ahmedkhaledkaraman@gmail.com> writes:

> Python script that prints the top 25 most executed functions in QEMU
> using callgrind.
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---
>  scripts/performance/callgrind_top_25.py | 95 +++++++++++++++++++++++++
>  1 file changed, 95 insertions(+)
>  create mode 100644 scripts/performance/callgrind_top_25.py
>
> diff --git a/scripts/performance/callgrind_top_25.py b/scripts/performance/callgrind_top_25.py
> new file mode 100644

You will want the script to be +x if the user is to execute it.

> index 0000000000..03b089a96d
> --- /dev/null
> +++ b/scripts/performance/callgrind_top_25.py
> @@ -0,0 +1,95 @@
> +#!/usr/bin/env python3
> +
> +#  Print the top 25 most executed functions in QEMU using callgrind.
> +#  Example Usage:
> +#  callgrind_top_25.py <qemu-build>/x86_64-linux-user/qemu-x86_64 executable

Why limit to 25? Make the name generic and maybe just default to 25
unless the user specifies a different option.

> +#
> +#  This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +#  This program is free software: you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation, either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import os
> +import sys
> +
> +# Ensure sufficient arguments
> +if len(sys.argv) < 3:
> +    print('Insufficient Script Arguments!')
> +    sys.exit(1)
> +
> +# Get the qemu path and the executable + its arguments
> +(qemu, executable) = (sys.argv[1], ' '.join(sys.argv[2:]))

I would recommend using:

  from argparse import ArgumentParser

from the start as adding options with hand parsing will be a pain. I
would suggest a specific option for the qemu binary and then using a
positional argument that can be read after -- so you don't confuse
options.

> +
> +# Run callgrind and callgrind_annotate
> +os.system('valgrind --tool=callgrind --callgrind-out-file=callgrind.data {} {} \
> +            2 > / dev / null & & callgrind_annotate callgrind.data \
> +            > tmp.callgrind.data'.
> +          format(qemu, executable))

Direct os.system calls are discouraged; you tend to get weird effects
like:

  ../../scripts/performance/callgrind_top_25.py ./aarch64-linux-user/qemu-aarch64 ./tests/tcg/aarch64-linux-user/fcvt
  sh: 1: Syntax error: "&" unexpected
  Traceback (most recent call last):
    File "../../scripts/performance/callgrind_top_25.py", line 52, in <module>
      with open('tmp.callgrind.data', 'r') as data:
  FileNotFoundError: [Errno 2] No such file or directory: 'tmp.callgrind.data'

I would:

  - check for valgrind in path and fail gracefully if not found
  - use the subprocess API for launching (with or without the shell)
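
A minimal sketch of these two suggestions, assuming the standard
`shutil` and `subprocess` modules. The helper names `callgrind_argv` and
`run_callgrind` are hypothetical and not part of the posted patch, and
the error messages are placeholders:

```python
import shutil
import subprocess
import sys


def callgrind_argv(command, out_file="callgrind.data"):
    """Build the valgrind argument list for `command` (a list of strings)."""
    return ["valgrind", "--tool=callgrind",
            "--callgrind-out-file=" + out_file] + list(command)


def run_callgrind(command, out_file="callgrind.data"):
    """Run `command` under callgrind without going through a shell."""
    # Fail gracefully if valgrind is not in PATH.
    if shutil.which("valgrind") is None:
        sys.exit("valgrind was not found in PATH, please install it first.")
    # Passing a list avoids shell quoting and redirection syntax entirely,
    # so a formatter cannot break tokens such as "&&" or "2>/dev/null".
    result = subprocess.run(callgrind_argv(command, out_file),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE)
    if result.returncode != 0:
        sys.exit(result.stderr.decode("utf-8", errors="replace"))
    return out_file
```

callgrind_annotate could be launched the same way, with its stdout
captured directly instead of being redirected to a temporary file.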

> +
> +# Line number with the total number of instructions
> +number_of_instructions_line = 20
> +
> +# Line number with the top function
> +first_func_line = 25

for example

    parser.add_argument('-n', dest="top", type=int, default=25,
                        help="Hottest n functions")

> +
> +# callgrind_annotate output
> +callgrind_data = []
> +
> +# Open callgrind_annotate output and store it in callgrind_data
> +with open('tmp.callgrind.data', 'r') as data:
> +    callgrind_data = data.readlines()
> +
> +# Get the total number of instructions
> +total_number_of_instructions = int(
> +    callgrind_data[number_of_instructions_line].split(' ')[0].replace(',', ''))

There is no harm in having your steps split out a little.
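
As a rough illustration of splitting the steps out, one possible shape
is below. The function name `parse_total_instructions` is hypothetical,
and the sample line in the docstring is an assumption about what the
totals line of callgrind_annotate looks like:

```python
def parse_total_instructions(summary_line):
    """Return the instruction count from the totals line.

    Assumed input shape: '1,234,567  PROGRAM TOTALS'.
    """
    # Step 1: take the first whitespace-separated field.
    first_field = summary_line.split()[0]
    # Step 2: drop the thousands separators.
    digits_only = first_field.replace(",", "")
    # Step 3: convert to an integer.
    return int(digits_only)
```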

> +
> +# Number of functions recorded by callgrind
> +number_of_functions = len(callgrind_data) - first_func_line
> +
> +# Limit the number of top functions to 25
> +number_of_top_functions = (25 if number_of_functions >
> +                           25 else number_of_instructions_line)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = callgrind_data[first_func_line:
> +                               first_func_line + number_of_top_functions]
> +# Print information headers
> +print('{:>4}  {:>10}  {:<25}  {}\n{}  {}  {}  {}'.format('No.',
> +                                                         'Percentage',
> +                                                         'Name',
> +                                                         'Source File',
> +                                                         '-' * 4,
> +                                                         '-' * 10,
> +                                                         '-' * 25,
> +                                                         '-' * 30,
> +                                                         ))
> +
> +# Print top 25 functions
> +for (index, function) in enumerate(top_functions, start=1):
> +    function_data = function.split()
> +    # Calculate function percentage
> +    percentage = (float(function_data[0].replace(
> +        ',', '')) / total_number_of_instructions) * 100
> +    # Get function source path and name
> +    path, name = function_data[1].split(':')
> +    # Print extracted data
> +    print('{:>4}  {:>9.3f}%  {:<25}  {}'.format(index,
> +                                                round(percentage, 3),
> +                                                name,
> +                                                path))
> +
> +# Remove intermediate files
> +os.system('rm callgrind.data tmp.callgrind.data')

os.unlink()
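
A small sketch of that cleanup, assuming the two intermediate file names
used in the patch. The helper name `remove_intermediate_files` is
hypothetical, and ignoring missing files is a design choice of this
sketch rather than something the patch does:

```python
import os


def remove_intermediate_files(paths):
    """Delete each path in `paths`, ignoring files that do not exist."""
    for path in paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            # Nothing to clean up, e.g. when valgrind failed early.
            pass
```

It would be called as
remove_intermediate_files(['callgrind.data', 'tmp.callgrind.data']).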

-- 
Alex Bennée

Re: [PATCH 2/3] scripts/performance: Add callgrind_top_25.py script

Posted by Ahmed Karaman 5 years, 7 months ago

Thanks Mr. Alex for your suggestions. I will send a v2 of this series
with the updates.

On Wed, Jun 17, 2020 at 2:16 PM Alex Bennée <alex.bennee@linaro.org> wrote:

> You will want the script to be +x if the user is to execute it.

Thanks for the reminder. Forgot to do this before sending the patch.

> > +#  Print the top 25 most executed functions in QEMU using callgrind.
> > +#  Example Usage:
> > +#  callgrind_top_25.py <qemu-build>/x86_64-linux-user/qemu-x86_64 executable
>
> Why limit to 25? Make the name generic and maybe just default to 25
> unless the user specifies a different option.

Very valid suggestion. Thanks!

>
> I would recommend using:
>
>   from argparse import ArgumentParser
>
> from the start as adding options with hand parsing will be a pain. I
> would suggest a specific option for the qemu binary and then using a
> positional argument that can be read after -- so you don't confuse
> options.
>

Great, what do you think of the format below:
topN_callgrind.py -n20 -- /path/to/qemu executable -executable - arguments

> Direct os.system calls are discouraged; you tend to get weird effects
> like:
>
>   ../../scripts/performance/callgrind_top_25.py ./aarch64-linux-user/qemu-aarch64 ./tests/tcg/aarch64-linux-user/fcvt
>   sh: 1: Syntax error: "&" unexpected
>   Traceback (most recent call last):
>     File "../../scripts/performance/callgrind_top_25.py", line 52, in <module>
>       with open('tmp.callgrind.data', 'r') as data:
>   FileNotFoundError: [Errno 2] No such file or directory: 'tmp.callgrind.data'
>
> I would:
>
>   - check for valgrind in path and fail gracefully if not found
>   - use the subprocess API for launching (with or without the shell)
>

This weird error was caused by the spaces inside "&&" and "2>/dev/null".
They were inserted by the autopep8 Python formatter before committing.
With the spaces removed, everything works fine, but I believe your
suggestion of using the subprocess module is valid, so I will implement
it and also check for valgrind as you've said.

> > +
> > +# Line number with the total number of instructions
> > +number_of_instructions_line = 20
> > +
> > +# Line number with the top function
> > +first_func_line = 25
>
> for example
>
>     parser.add_argument('-n', dest="top", type=int, default=25,
>                         help="Hottest n functions")

Will also use:
    parser.add_argument('command',  type=str, nargs='+',
                help="QEMU invocation to report the top functions for")
To parse all remaining arguments after "--".
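
Put together, the two add_argument() calls could be sketched as follows.
The `build_parser` wrapper, the description string, and the sample
argument `-executable-argument` are placeholders for illustration only:

```python
from argparse import ArgumentParser


def build_parser():
    """Parser with a -n option and the QEMU invocation as positionals."""
    parser = ArgumentParser(
        description="Print the most executed functions in QEMU")
    parser.add_argument('-n', dest="top", type=int, default=25,
                        help="Hottest n functions")
    parser.add_argument('command', type=str, nargs='+',
                        help="QEMU invocation to report the top functions for")
    return parser


# Everything after "--" is treated as positional, even if it starts
# with a dash, so arguments meant for the executable are not parsed
# as options of the script.
args = build_parser().parse_args(
    ['-n20', '--', '/path/to/qemu', 'executable', '-executable-argument'])
print(args.top, args.command)
```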

> > +# Get the total number of instructions
> > +total_number_of_instructions = int(
> > +    callgrind_data[number_of_instructions_line].split(' ')[0].replace(',', ''))
>
> There is no harm in having your steps split out a little.

Noted!

> > +# Remove intermediate files
> > +os.system('rm callgrind.data tmp.callgrind.data')
>
> os.unlink()
>

Noted!

[PATCH 1/3] MAINTAINERS: Add 'Miscellaneous' section
[PATCH 2/3] scripts/performance: Add callgrind_top_25.py script
[PATCH 3/3] scripts/performance: Add perf_top_25.py script