[PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform

James Clark posted 2 patches 1 year, 2 months ago
[PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by James Clark 1 year, 2 months ago
Since the linked fixes: commit, specifying a CPU on hybrid platforms
results in an error because Perf tries to open an extended type event
on "any" CPU which isn't valid. Extended type events can only be opened
on CPUs that match the type.

Before (working):

  $ perf record --cpu 1 -- true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]

After (not working):

  $ perf record -C 1 -- true
  WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
  /bin/dmesg | grep -i perf may provide additional information.

(Ignore the warning message, that's expected and not particularly
relevant to this issue).

This is because perf_cpu_map__intersect() of the user specified CPU (1)
and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
CPU map. However for the purposes of opening an event, libperf converts
empty CPU maps into an any CPU (-1) which the kernel rejects.

Fix it by deleting evsels with empty CPU maps in the specific case where
user requested CPU maps are evaluated.

Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/lib/perf/evlist.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index c6d67fc9e57e..83c43dc13313 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		 */
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
+
+		/*
+		 * Empty cpu lists would eventually get opened as "any" so remove
+		 * genuinely empty ones before they're opened in the wrong place.
+		 */
+		if (perf_cpu_map__is_empty(evsel->cpus)) {
+			struct perf_evsel *next = perf_evlist__next(evlist, evsel);
+
+			perf_evlist__remove(evlist, evsel);
+			/* Keep idx contiguous */
+			if (next)
+				list_for_each_entry_from(next, &evlist->entries, node)
+					next->idx--;
+		}
 	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
 		(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
 		/*
@@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 
 static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
 {
-	struct perf_evsel *evsel;
+	struct perf_evsel *evsel, *n;
 
 	evlist->needs_map_propagation = true;
 
-	perf_evlist__for_each_evsel(evlist, evsel)
+	list_for_each_entry_safe(evsel, n, &evlist->entries, node)
 		__perf_evlist__propagate_maps(evlist, evsel);
 }
 
-- 
2.34.1
Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by Arnaldo Carvalho de Melo 1 year, 2 months ago
On Thu, Nov 14, 2024 at 04:04:48PM +0000, James Clark wrote:
> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> results in an error because Perf tries to open an extended type event
> on "any" CPU which isn't valid. Extended type events can only be opened
> on CPUs that match the type.
 
> Before (working):
 
>   $ perf record --cpu 1 -- true
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
 
> After (not working):
 
>   $ perf record -C 1 -- true
>   WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>   Error:
>   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>   /bin/dmesg | grep -i perf may provide additional information.
> 
> (Ignore the warning message, that's expected and not particularly
> relevant to this issue).
> 
> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> CPU map. However for the purposes of opening an event, libperf converts
> empty CPU maps into an any CPU (-1) which the kernel rejects.
> 
> Fix it by deleting evsels with empty CPU maps in the specific case where
> user requested CPU maps are evaluated.


Namhyung, I think this should go via perf-tools,

Before, on an hybrid Intel processor:

  ⬢ [acme@toolbox perf-tools]$ grep -m1 'model name' /proc/cpuinfo 
  model name	: 13th Gen Intel(R) Core(TM) i7-1365U
  ⬢ [acme@toolbox perf-tools]$

  root@x1:~# perf record -C 1 -- true
  WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles/P).
  "dmesg | grep -i perf" may provide additional information.
  root@x1:~# 

After the patch:

  root@x1:~# perf record -C 1 -- true
  WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.369 MB perf.data (8 samples) ]
  root@x1:~#
  root@x1:~# perf evlist -v
  cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000000, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
  dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  root@x1:~#

All on CPU 1:

  root@x1:~# perf report -D | grep PERF_RECORD_SAMPLE
  1 1809541166457609 0x15e410 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1ce418 period: 1 addr: 0
  1 1809541166470019 0x15e448 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0206b1 period: 1 addr: 0
  1 1809541166475543 0x15e480 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 5 addr: 0
  1 1809541166478316 0x15e4b8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 53 addr: 0
  1 1809541166480879 0x15e4f0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 667 addr: 0
  1 1809541166483895 0x15e528 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1bcbcc period: 8537 addr: 0
  1 1809541166489116 0x15e560 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a009b3b period: 98429 addr: 0
  1 1809541168375508 0x15e620 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffff9b0e2a34 period: 732596 addr: 0
  root@x1:~#

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

James, the second patch isn't applying to perf-tools/perf-tools.

- Arnaldo
 
> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> Reviewed-by: Ian Rogers <irogers@google.com>
> Tested-by: Thomas Falcon <thomas.falcon@intel.com>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  tools/lib/perf/evlist.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index c6d67fc9e57e..83c43dc13313 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  		 */
>  		perf_cpu_map__put(evsel->cpus);
>  		evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> +
> +		/*
> +		 * Empty cpu lists would eventually get opened as "any" so remove
> +		 * genuinely empty ones before they're opened in the wrong place.
> +		 */
> +		if (perf_cpu_map__is_empty(evsel->cpus)) {
> +			struct perf_evsel *next = perf_evlist__next(evlist, evsel);
> +
> +			perf_evlist__remove(evlist, evsel);
> +			/* Keep idx contiguous */
> +			if (next)
> +				list_for_each_entry_from(next, &evlist->entries, node)
> +					next->idx--;
> +		}
>  	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
>  		(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
>  		/*
> @@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  
>  static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>  {
> -	struct perf_evsel *evsel;
> +	struct perf_evsel *evsel, *n;
>  
>  	evlist->needs_map_propagation = true;
>  
> -	perf_evlist__for_each_evsel(evlist, evsel)
> +	list_for_each_entry_safe(evsel, n, &evlist->entries, node)
>  		__perf_evlist__propagate_maps(evlist, evsel);
>  }
>  
> -- 
> 2.34.1
Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by James Clark 1 year, 2 months ago

On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> On Thu, Nov 14, 2024 at 04:04:48PM +0000, James Clark wrote:
>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>> results in an error because Perf tries to open an extended type event
>> on "any" CPU which isn't valid. Extended type events can only be opened
>> on CPUs that match the type.
>   
>> Before (working):
>   
>>    $ perf record --cpu 1 -- true
>>    [ perf record: Woken up 1 times to write data ]
>>    [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>   
>> After (not working):
>   
>>    $ perf record -C 1 -- true
>>    WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>>    Error:
>>    The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>>    /bin/dmesg | grep -i perf may provide additional information.
>>
>> (Ignore the warning message, that's expected and not particularly
>> relevant to this issue).
>>
>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>> CPU map. However for the purposes of opening an event, libperf converts
>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>>
>> Fix it by deleting evsels with empty CPU maps in the specific case where
>> user requested CPU maps are evaluated.
> 
> 
> Namhyung, I think this should go via perf-tools,
> 
> Before, on an hybrid Intel processor:
> 
>    ⬢ [acme@toolbox perf-tools]$ grep -m1 'model name' /proc/cpuinfo
>    model name	: 13th Gen Intel(R) Core(TM) i7-1365U
>    ⬢ [acme@toolbox perf-tools]$
> 
>    root@x1:~# perf record -C 1 -- true
>    WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
>    Error:
>    The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles/P).
>    "dmesg | grep -i perf" may provide additional information.
>    root@x1:~#
> 
> After the patch:
> 
>    root@x1:~# perf record -C 1 -- true
>    WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
>    [ perf record: Woken up 1 times to write data ]
>    [ perf record: Captured and wrote 1.369 MB perf.data (8 samples) ]
>    root@x1:~#
>    root@x1:~# perf evlist -v
>    cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000000, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
>    dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
>    root@x1:~#
> 
> All on CPU 1:
> 
>    root@x1:~# perf report -D | grep PERF_RECORD_SAMPLE
>    1 1809541166457609 0x15e410 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1ce418 period: 1 addr: 0
>    1 1809541166470019 0x15e448 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0206b1 period: 1 addr: 0
>    1 1809541166475543 0x15e480 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 5 addr: 0
>    1 1809541166478316 0x15e4b8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 53 addr: 0
>    1 1809541166480879 0x15e4f0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 667 addr: 0
>    1 1809541166483895 0x15e528 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1bcbcc period: 8537 addr: 0
>    1 1809541166489116 0x15e560 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a009b3b period: 98429 addr: 0
>    1 1809541168375508 0x15e620 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffff9b0e2a34 period: 732596 addr: 0
>    root@x1:~#
> 
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> James, the second patch isn't applying to perf-tools/perf-tools.
> 
> - Arnaldo
>   

The second one applies on 
https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd 
because that also does an evlist__remove() which gets fixed up. But the 
first one is ok to go in on its own.

>> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
>> Reviewed-by: Ian Rogers <irogers@google.com>
>> Tested-by: Thomas Falcon <thomas.falcon@intel.com>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>>   tools/lib/perf/evlist.c | 18 ++++++++++++++++--
>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index c6d67fc9e57e..83c43dc13313 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>   		 */
>>   		perf_cpu_map__put(evsel->cpus);
>>   		evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
>> +
>> +		/*
>> +		 * Empty cpu lists would eventually get opened as "any" so remove
>> +		 * genuinely empty ones before they're opened in the wrong place.
>> +		 */
>> +		if (perf_cpu_map__is_empty(evsel->cpus)) {
>> +			struct perf_evsel *next = perf_evlist__next(evlist, evsel);
>> +
>> +			perf_evlist__remove(evlist, evsel);
>> +			/* Keep idx contiguous */
>> +			if (next)
>> +				list_for_each_entry_from(next, &evlist->entries, node)
>> +					next->idx--;
>> +		}
>>   	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
>>   		(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
>>   		/*
>> @@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>   
>>   static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>>   {
>> -	struct perf_evsel *evsel;
>> +	struct perf_evsel *evsel, *n;
>>   
>>   	evlist->needs_map_propagation = true;
>>   
>> -	perf_evlist__for_each_evsel(evlist, evsel)
>> +	list_for_each_entry_safe(evsel, n, &evlist->entries, node)
>>   		__perf_evlist__propagate_maps(evlist, evsel);
>>   }
>>   
>> -- 
>> 2.34.1

Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by Arnaldo Carvalho de Melo 1 year, 2 months ago
On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > James, the second patch isn't applying to perf-tools/perf-tools.
 
> The second one applies on
> https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> because that also does an evlist__remove() which gets fixed up.

Right, I have to test that series on the ARM machines I have access to,
but there is a question from a tester that is waiting for a reply, I'll
see if I can reproduce that problem as well.

> But the first one is ok to go in on its own.
 
Agreed.

- Arnaldo
Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by Namhyung Kim 1 year, 2 months ago
On Tue, Dec 10, 2024 at 11:07:13AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> > On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > > James, the second patch isn't applying to perf-tools/perf-tools.
>  
> > The second one applies on
> > https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> > because that also does an evlist__remove() which gets fixed up.
> 
> Right, I have to test that series on the ARM machines I have access to,
> but there is a question from a tester that is waiting for a reply, I'll
> see if I can reproduce that problem as well.
> 
> > But the first one is ok to go in on its own.
>  
> Agreed.

Ok, I'll pick this one to perf-tools and leave the patch 2 go into
perf-tools-next.

Thanks,
Namhyung
Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
Posted by Namhyung Kim 1 year, 1 month ago
On Tue, Dec 10, 2024 at 10:10:21AM -0800, Namhyung Kim wrote:
> On Tue, Dec 10, 2024 at 11:07:13AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> > > On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > > > James, the second patch isn't applying to perf-tools/perf-tools.
> >  
> > > The second one applies on
> > > https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> > > because that also does an evlist__remove() which gets fixed up.
> > 
> > Right, I have to test that series on the ARM machines I have access to,
> > but there is a question from a tester that is waiting for a reply, I'll
> > see if I can reproduce that problem as well.
> > 
> > > But the first one is ok to go in on its own.
> >  
> > Agreed.
> 
> Ok, I'll pick this one to perf-tools and leave the patch 2 go into
> perf-tools-next.

Applied patch 1 to perf-tools, thanks!

Best regards,
Namhyung