[PATCH -next 1/2] perf stat: Increase perf_attr_map entries

Posted by Tengda Wu 2 months ago
bperf restricts the size of perf_attr_map to 16 entries, which
cannot hold all events in many scenarios. A typical example is
when the user specifies `-a -ddd` ([0]). Another is top-down
analysis, which often requires more than 16 PMU events to be
collected simultaneously.

Fix this by increasing the number of perf_attr_map entries to 100,
and introduce an event count check in bperf__load() so that users
get a friendly error message when the event limit is exceeded.

  [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@kernel.org/

Fixes: 7fac83aaf2ee ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
---
 tools/perf/util/bpf_counter.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 7a8af60e0f51..3346129c20cf 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -28,7 +28,7 @@
 #include "bpf_skel/bperf_leader.skel.h"
 #include "bpf_skel/bperf_follower.skel.h"
 
-#define ATTR_MAP_SIZE 16
+#define ATTR_MAP_SIZE 100
 
 static inline void *u64_to_ptr(__u64 ptr)
 {
@@ -451,6 +451,12 @@ static int bperf__load(struct evsel *evsel, struct target *target)
 	enum bperf_filter_type filter_type;
 	__u32 filter_entry_cnt, i;
 
+	if (evsel->evlist->core.nr_entries > ATTR_MAP_SIZE) {
+		pr_err("Too many events, please limit to %d or less\n",
+			ATTR_MAP_SIZE);
+		return -1;
+	}
+
 	if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
 		return -1;
 
-- 
2.34.1
Re: [PATCH -next 1/2] perf stat: Increase perf_attr_map entries
Posted by Namhyung Kim 2 months ago
On Wed, Sep 25, 2024 at 01:55:22PM +0000, Tengda Wu wrote:
> bperf restricts the size of perf_attr_map's entries to 16, which
> cannot hold all events in many scenarios. A typical example is
> when the user specifies `-a -ddd` ([0]). And in other cases such as
> top-down analysis, which often requires a set of more than 16 PMUs
> to be collected simultaneously.
> 
> Fix this by increase perf_attr_map entries to 100, and an event
> number check has been introduced when bperf__load() to ensure that
> users receive a more friendly prompt when the event limit is reached.
> 
>   [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@kernel.org/

Apparently this patch was never applied.  I don't know how many you need,
but having too many events at the same time won't be very useful because
multiplexing could reduce the accuracy.

Thanks,
Namhyung

Re: [PATCH -next 1/2] perf stat: Increase perf_attr_map entries
Posted by Tengda Wu 2 months ago

On 2024/9/26 12:16, Namhyung Kim wrote:
> On Wed, Sep 25, 2024 at 01:55:22PM +0000, Tengda Wu wrote:
>> bperf restricts the size of perf_attr_map's entries to 16, which
>> cannot hold all events in many scenarios. A typical example is
>> when the user specifies `-a -ddd` ([0]). And in other cases such as
>> top-down analysis, which often requires a set of more than 16 PMUs
>> to be collected simultaneously.
>>
>> Fix this by increase perf_attr_map entries to 100, and an event
>> number check has been introduced when bperf__load() to ensure that
>> users receive a more friendly prompt when the event limit is reached.
>>
>>   [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@kernel.org/
> 
> Apparently this patch was never applied.  I don't know how much you need
> but having too many events at the same time won't be very useful because
> multiplexing could reduce the accuracy.
> 

Could you please explain why patch [0] was not merged at that time? I couldn't
find this information from the previous emails.

In my scenario, we collect more than 40 events to support the necessary
metric calculations, so multiplexing is inevitable. Although multiplexing
may reduce accuracy, that loss is acceptable for the purpose of metric
calculation, and perf itself has the same multiplexing issue. Removing
the event limit for bperf gives users an additional option.

In addition to accuracy, we also care about overhead. I compared the
overhead of bperf and perf by running ./lat_ctx from lmbench [1], and
found that the overhead of bperf stat is about 4% lower than that of
perf stat. This is why we choose bperf in some extreme scenarios.

  [1] https://github.com/intel/lmbench
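For reference, a comparison along these lines can be reproduced roughly as follows. The event list and lat_ctx arguments are illustrative; `--bpf-counters` is the perf stat flag that enables bperf, and lat_ctx ships with lmbench:

```shell
# Same benchmark under both counter backends; compare elapsed time
# and counter deltas between the two runs.
perf stat --bpf-counters -e cycles,instructions -- ./lat_ctx -s 0 2
perf stat -e cycles,instructions -- ./lat_ctx -s 0 2
```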

Thanks,
Tengda

Re: [PATCH -next 1/2] perf stat: Increase perf_attr_map entries
Posted by Namhyung Kim 2 months ago
On Fri, Sep 27, 2024 at 10:35:54AM +0800, Tengda Wu wrote:
> 
> 
> On 2024/9/26 12:16, Namhyung Kim wrote:
> > On Wed, Sep 25, 2024 at 01:55:22PM +0000, Tengda Wu wrote:
> >> bperf restricts the size of perf_attr_map's entries to 16, which
> >> cannot hold all events in many scenarios. A typical example is
> >> when the user specifies `-a -ddd` ([0]). And in other cases such as
> >> top-down analysis, which often requires a set of more than 16 PMUs
> >> to be collected simultaneously.
> >>
> >> Fix this by increase perf_attr_map entries to 100, and an event
> >> number check has been introduced when bperf__load() to ensure that
> >> users receive a more friendly prompt when the event limit is reached.
> >>
> >>   [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@kernel.org/
> > 
> > Apparently this patch was never applied.  I don't know how much you need
> > but having too many events at the same time won't be very useful because
> > multiplexing could reduce the accuracy.
> > 
> 
> Could you please explain why patch [0] was not merged at that time? I couldn't
> find this information from the previous emails.

I guess it just fell through the cracks. :)

> 
> In my scenario, we collect more than 40+ events to support necessary metric
> calculations, which multiplexing is inevitable. Although multiplexing may
> reduce accuracy, for the purpose of supporting metric calculations, these
> accuracy losses can be acceptable. Perf also has the same issue with multiplexing.
> Removing the event limit for bperf can provide users with additional options.
> 
> In addition to accuracy, we also care about overhead. I compared the overhead
> of bperf and perf by testing ./lat_ctx in lmbench [1], and found that the
> overhead of bperf stat is about 4% less than perf. This is why we choose to
> use bperf in some extreme scenarios.

Ok, thanks for the explanation.  I think it's ok to increase the limit.

Thanks,
Namhyung

Re: [PATCH -next 1/2] perf stat: Increase perf_attr_map entries
Posted by Tengda Wu 2 months ago

On 2024/9/28 1:12, Namhyung Kim wrote:
> On Fri, Sep 27, 2024 at 10:35:54AM +0800, Tengda Wu wrote:
>>
>>
>> On 2024/9/26 12:16, Namhyung Kim wrote:
>>> On Wed, Sep 25, 2024 at 01:55:22PM +0000, Tengda Wu wrote:
>>>> bperf restricts the size of perf_attr_map's entries to 16, which
>>>> cannot hold all events in many scenarios. A typical example is
>>>> when the user specifies `-a -ddd` ([0]). And in other cases such as
>>>> top-down analysis, which often requires a set of more than 16 PMUs
>>>> to be collected simultaneously.
>>>>
>>>> Fix this by increase perf_attr_map entries to 100, and an event
>>>> number check has been introduced when bperf__load() to ensure that
>>>> users receive a more friendly prompt when the event limit is reached.
>>>>
>>>>   [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@kernel.org/
>>>
>>> Apparently this patch was never applied.  I don't know how much you need
>>> but having too many events at the same time won't be very useful because
>>> multiplexing could reduce the accuracy.
>>>
>>
>> Could you please explain why patch [0] was not merged at that time? I couldn't
>> find this information from the previous emails.
> 
> I guess it's just fell through the crack. :)

Hope it won't happen again. 😆
