[PATCH v2 2/2] perf stat: Stop repeating when run_perf_stat() returns -1

Levi Yun posted 2 patches 2 months, 2 weeks ago
There is a newer version of this series
[PATCH v2 2/2] perf stat: Stop repeating when run_perf_stat() returns -1
Posted by Levi Yun 2 months, 2 weeks ago
Exit when run_perf_stat() returns an error to avoid continuously
repeating the same error message. It's not expected that COUNTER_FATAL
or internal errors are recoverable so there's no point in retrying.

This fixes the following flood of error messages for permission issues,
for example when perf_event_paranoid==3:
  perf stat -r 1044 -- false

  Error:
  Access to performance monitoring and observability operations is limited.
  ...
  Error:
  Access to performance monitoring and observability operations is limited.
  ...
  (repeating for 1044 times).

Signed-off-by: Levi Yun <yeoreum.yun@arm.com>
---
Changes in v2:
  - Add some comments.
---
 tools/perf/builtin-stat.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 954eb37ce7b8..0153925f2382 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2875,7 +2875,15 @@ int cmd_stat(int argc, const char **argv)
 			evlist__reset_prev_raw_counts(evsel_list);

 		status = run_perf_stat(argc, argv, run_idx);
-		if (forever && status != -1 && !interval) {
+		/*
+		 * * Meet COUNTER_FATAL situation (i.e) can't open event counter.
+		 * * In this case, there is a high chance of failure in the next attempt
+		 * * as well with the same reason. so, stop it.
+		 * */
+		if (status == -1)
+			break;
+
+		if (forever && !interval) {
 			print_counters(NULL, argc, argv);
 			perf_stat__reset_stats();
 		}
--
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
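
The effect of the change can be modelled outside perf. The following is a minimal sketch, not the actual cmd_stat() code: fake_run_always_fatal() and count_attempts() are hypothetical names standing in for run_perf_stat() and the repeat loop, and the stub is assumed to fail on every run, as in the perf_event_paranoid==3 case described above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for run_perf_stat(): every run fails with -1. */
static int fake_run_always_fatal(int run_idx)
{
	(void)run_idx;
	return -1;
}

/*
 * Simplified model of the repeat loop. Returns how many runs were
 * attempted. With stop_on_fatal set, the first -1 ends the loop,
 * which is the behaviour the patch introduces.
 */
static int count_attempts(int run_count, bool stop_on_fatal,
			  int (*run)(int run_idx))
{
	int attempts = 0;

	for (int run_idx = 0; run_idx < run_count; run_idx++) {
		int status = run(run_idx);

		attempts++;
		if (stop_on_fatal && status == -1)
			break;
	}
	return attempts;
}
```

Calling count_attempts(1044, false, fake_run_always_fatal) models the old behaviour, where every repeat is attempted and reports its own error. Passing true models the patched loop, which gives up after the first failure.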
Re: [PATCH v2 2/2] perf stat: Stop repeating when run_perf_stat() returns -1
Posted by James Clark 2 months, 2 weeks ago

On 13/09/2024 03:02, Levi Yun wrote:
> Exit when run_perf_stat() returns an error to avoid continuously
> repeating the same error message. It's not expected that COUNTER_FATAL
> or internal errors are recoverable so there's no point in retrying.
> 
> This fixes the following flood of error messages for permission issues,
> for example when perf_event_paranoid==3:
>    perf stat -r 1044 -- false
> 
>    Error:
>    Access to performance monitoring and observability operations is limited.
>    ...
>    Error:
>    Access to performance monitoring and observability operations is limited.
>    ...
>    (repeating for 1044 times).
> 
> Signed-off-by: Levi Yun <yeoreum.yun@arm.com>
> ---
> Changes in v2:
>    - Add some comments.
> ---
>   tools/perf/builtin-stat.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 954eb37ce7b8..0153925f2382 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2875,7 +2875,15 @@ int cmd_stat(int argc, const char **argv)
>   			evlist__reset_prev_raw_counts(evsel_list);
> 
>   		status = run_perf_stat(argc, argv, run_idx);
> -		if (forever && status != -1 && !interval) {
> +		/*
> +		 * * Meet COUNTER_FATAL situation (i.e) can't open event counter.
> +		 * * In this case, there is a high chance of failure in the next attempt
> +		 * * as well with the same reason. so, stop it.
> +		 * */

There's something wrong with the formatting here.

But I don't think the comment answers my question about the other return
codes. It just states what the code does.

There are many more places that return -1 than just COUNTER_FATAL, so it's
not just that situation anyway. In addition, there are -ENOMEM and other
return codes that aren't -1, and nothing explains whether those are
deliberately retried or ignored.

> +		if (status == -1)
> +			break;
> +
> +		if (forever && !interval) {
>   			print_counters(NULL, argc, argv);
>   			perf_stat__reset_stats();
>   		}
> --
> LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
>
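
The distinction the review asks about can be made explicit in code. This is only a sketch under one assumption about the intent: that -1 alone is meant to stop the repeat loop, while other negative codes such as -ENOMEM are treated as non-fatal. status_is_fatal() and attempts_before_stop() are hypothetical helpers for illustration and do not exist in perf.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of perf. Assumed convention:
 *   -1    fatal, stop repeating
 *   < -1  non-fatal error code (for example -ENOMEM), keep repeating
 *   >= 0  success
 */
static bool status_is_fatal(int status)
{
	return status == -1;
}

/*
 * Walks a scripted sequence of run statuses and returns how many runs
 * were attempted before a fatal status ended the sequence.
 */
static int attempts_before_stop(const int *statuses, int n)
{
	int attempts = 0;

	for (int i = 0; i < n; i++) {
		attempts++;
		if (status_is_fatal(statuses[i]))
			break;
	}
	return attempts;
}
```

Keeping the decision in one named predicate gives a single place to document which return values stop the loop and which are deliberately ignored.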
Re: [PATCH v2 2/2] perf stat: Stop repeating when run_perf_stat() returns -1
Posted by James Clark 2 months, 2 weeks ago

On 13/09/2024 09:36, James Clark wrote:
> 
> 
> On 13/09/2024 03:02, Levi Yun wrote:
>> Exit when run_perf_stat() returns an error to avoid continuously
>> repeating the same error message. It's not expected that COUNTER_FATAL
>> or internal errors are recoverable so there's no point in retrying.
>>
>> This fixes the following flood of error messages for permission issues,
>> for example when perf_event_paranoid==3:
>>    perf stat -r 1044 -- false
>>
>>    Error:
>>    Access to performance monitoring and observability operations is 
>> limited.
>>    ...
>>    Error:
>>    Access to performance monitoring and observability operations is 
>> limited.
>>    ...
>>    (repeating for 1044 times).
>>
>> Signed-off-by: Levi Yun <yeoreum.yun@arm.com>
>> ---
>> Changes in v2:
>>    - Add some comments.
>> ---
>>   tools/perf/builtin-stat.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 954eb37ce7b8..0153925f2382 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -2875,7 +2875,15 @@ int cmd_stat(int argc, const char **argv)
>>               evlist__reset_prev_raw_counts(evsel_list);
>>
>>           status = run_perf_stat(argc, argv, run_idx);
>> -        if (forever && status != -1 && !interval) {
>> +        /*
>> +         * * Meet COUNTER_FATAL situation (i.e) can't open event 
>> counter.
>> +         * * In this case, there is a high chance of failure in the 
>> next attempt
>> +         * * as well with the same reason. so, stop it.
>> +         * */
> 
> There's something wrong with the formatting here.
> 
> But I don't think the comment answers my question about the other return
> codes. It just states what the code does.
> 
> There are many more places that return -1 than just COUNTER_FATAL, so it's
> not just that situation anyway. In addition, there are -ENOMEM and other
> return codes that aren't -1, and nothing explains whether those are
> deliberately retried or ignored.
> 

If I've understood the intent correctly, what about something like this:

/*
 * Returns -1 for fatal errors, which signals not to continue
 * when in repeat mode.
 *
 * Returns error codes < -1 when stat record is used. In that
 * case the stat information is still displayed, but writing
 * to the file fails, which is non-fatal.
 */
static int __run_perf_stat(int argc, const char **argv, int run_idx)
{