arch/x86/events/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Perf doesn't work at perf stat for hardware events:
$perf stat -- sleep 1
Performance counter stats for 'sleep 1':
16.44 msec task-clock # 0.016 CPUs utilized
2 context-switches # 121.691 /sec
0 cpu-migrations # 0.000 /sec
54 page-faults # 3.286 K/sec
<not supported> cycles
<not supported> instructions
<not supported> branches
<not supported> branch-misses
The reason is that the check in x86_pmu_hw_config for sampling event is
unexpectedly applied to the counting event.
Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
---
arch/x86/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6866cc5acb0b..3a4f031d2f44 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
if (event->attr.type == event->pmu->type)
event->hw.config |= x86_pmu_get_event_config(event);
- if (!event->attr.freq && x86_pmu.limit_period) {
+ if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
s64 left = event->attr.sample_period;
x86_pmu.limit_period(event, &left);
if (left > event->attr.sample_period)
--
2.34.1
* Luo Gengkun <luogengkun@huaweicloud.com> wrote:
> Perf doesn't work at perf stat for hardware events:
>
> $perf stat -- sleep 1
> Performance counter stats for 'sleep 1':
> 16.44 msec task-clock # 0.016 CPUs utilized
> 2 context-switches # 121.691 /sec
> 0 cpu-migrations # 0.000 /sec
> 54 page-faults # 3.286 K/sec
> <not supported> cycles
> <not supported> instructions
> <not supported> branches
> <not supported> branch-misses
>
> The reason is that the check in x86_pmu_hw_config for sampling event is
> unexpectedly applied to the counting event.
>
> Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
> ---
> arch/x86/events/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 6866cc5acb0b..3a4f031d2f44 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
> if (event->attr.type == event->pmu->type)
> event->hw.config |= x86_pmu_get_event_config(event);
>
> - if (!event->attr.freq && x86_pmu.limit_period) {
> + if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
Hm, so how come it works here, on an affected x86 system:
$ perf stat -- sleep 1
Performance counter stats for 'sleep 1':
0.64 msec task-clock:u # 0.001 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
73 page-faults:u # 114.063 K/sec
325,849 instructions:u # 0.56 insn per cycle
# 0.88 stalled cycles per insn
580,323 cycles:u # 0.907 GHz
286,348 stalled-cycles-frontend:u # 49.34% frontend cycles idle
72,623 branches:u # 113.474 M/sec
4,713 branch-misses:u # 6.49% of all branches
?
Thanks,
Ingo
On 2025-04-24 12:20 p.m., Ingo Molnar wrote:
>
> * Luo Gengkun <luogengkun@huaweicloud.com> wrote:
>
>> Perf doesn't work at perf stat for hardware events:
>>
>> $perf stat -- sleep 1
>> Performance counter stats for 'sleep 1':
>> 16.44 msec task-clock # 0.016 CPUs utilized
>> 2 context-switches # 121.691 /sec
>> 0 cpu-migrations # 0.000 /sec
>> 54 page-faults # 3.286 K/sec
>> <not supported> cycles
>> <not supported> instructions
>> <not supported> branches
>> <not supported> branch-misses
>>
>> The reason is that the check in x86_pmu_hw_config for sampling event is
>> unexpectedly applied to the counting event.
>>
>> Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
>> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
>> ---
>> arch/x86/events/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 6866cc5acb0b..3a4f031d2f44 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
>> if (event->attr.type == event->pmu->type)
>> event->hw.config |= x86_pmu_get_event_config(event);
>>
>> - if (!event->attr.freq && x86_pmu.limit_period) {
>> + if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
>
> Hm, so how come it works here, on an affected x86 system:
>
> $ perf stat -- sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 0.64 msec task-clock:u # 0.001 CPUs utilized
> 0 context-switches:u # 0.000 /sec
> 0 cpu-migrations:u # 0.000 /sec
> 73 page-faults:u # 114.063 K/sec
> 325,849 instructions:u # 0.56 insn per cycle
> # 0.88 stalled cycles per insn
> 580,323 cycles:u # 0.907 GHz
> 286,348 stalled-cycles-frontend:u # 49.34% frontend cycles idle
> 72,623 branches:u # 113.474 M/sec
> 4,713 branch-misses:u # 6.49% of all branches
>
>
> ?
It doesn't affect all X86 platforms. It should only impact the platforms
with limit_period used for the non-pebs events. For Intel platforms, it
should only impact some older platforms, e.g., HSW, BDW and NHM.
For other platforms, the x86_pmu.limit_period is invoked. But the left
is not updated. So it still equals to event->attr.sample_period.
It doesn't error out.
if (!event->attr.freq && x86_pmu.limit_period) {
s64 left = event->attr.sample_period;
x86_pmu.limit_period(event, &left);
if (left > event->attr.sample_period)
return -EINVAL;
}
Thanks,
Kan
* Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
> On 2025-04-24 12:20 p.m., Ingo Molnar wrote:
> >
> > * Luo Gengkun <luogengkun@huaweicloud.com> wrote:
> >
> >> Perf doesn't work at perf stat for hardware events:
> >>
> >> $perf stat -- sleep 1
> >> Performance counter stats for 'sleep 1':
> >> 16.44 msec task-clock # 0.016 CPUs utilized
> >> 2 context-switches # 121.691 /sec
> >> 0 cpu-migrations # 0.000 /sec
> >> 54 page-faults # 3.286 K/sec
> >> <not supported> cycles
> >> <not supported> instructions
> >> <not supported> branches
> >> <not supported> branch-misses
> >>
> >> The reason is that the check in x86_pmu_hw_config for sampling event is
> >> unexpectedly applied to the counting event.
> >>
> >> Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
> >> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
> >> ---
> >> arch/x86/events/core.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> >> index 6866cc5acb0b..3a4f031d2f44 100644
> >> --- a/arch/x86/events/core.c
> >> +++ b/arch/x86/events/core.c
> >> @@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
> >> if (event->attr.type == event->pmu->type)
> >> event->hw.config |= x86_pmu_get_event_config(event);
> >>
> >> - if (!event->attr.freq && x86_pmu.limit_period) {
> >> + if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
> >
> > Hm, so how come it works here, on an affected x86 system:
> >
> > $ perf stat -- sleep 1
> >
> > Performance counter stats for 'sleep 1':
> >
> > 0.64 msec task-clock:u # 0.001 CPUs utilized
> > 0 context-switches:u # 0.000 /sec
> > 0 cpu-migrations:u # 0.000 /sec
> > 73 page-faults:u # 114.063 K/sec
> > 325,849 instructions:u # 0.56 insn per cycle
> > # 0.88 stalled cycles per insn
> > 580,323 cycles:u # 0.907 GHz
> > 286,348 stalled-cycles-frontend:u # 49.34% frontend cycles idle
> > 72,623 branches:u # 113.474 M/sec
> > 4,713 branch-misses:u # 6.49% of all branches
> >
> >
> > ?
>
> It doesn't affect all X86 platforms. It should only impact the platforms
> with limit_period used for the non-pebs events. For Intel platforms, it
> should only impact some older platforms, e.g., HSW, BDW and NHM.
>
> For other platforms, the x86_pmu.limit_period is invoked. But the left
> is not updated. So it still equals to event->attr.sample_period.
> It doesn't error out.
>
> if (!event->attr.freq && x86_pmu.limit_period) {
> s64 left = event->attr.sample_period;
> x86_pmu.limit_period(event, &left);
> if (left > event->attr.sample_period)
> return -EINVAL;
> }
Makes sense. I've added this paragraph to the changelog:
It should only impact x86 platforms with limit_period used for non-PEBS
events. For Intel platforms, it should only impact some older platforms,
e.g., HSW, BDW and NHM.
Thanks,
Ingo
On 23-Apr-25 12:17 PM, Luo Gengkun wrote:
> Perf doesn't work at perf stat for hardware events:
>
> $perf stat -- sleep 1
> Performance counter stats for 'sleep 1':
> 16.44 msec task-clock # 0.016 CPUs utilized
> 2 context-switches # 121.691 /sec
> 0 cpu-migrations # 0.000 /sec
> 54 page-faults # 3.286 K/sec
> <not supported> cycles
> <not supported> instructions
> <not supported> branches
> <not supported> branch-misses
Wondering if it is worth to add this in perf test. Something like
below?
--- a/tools/perf/tests/shell/stat.sh
+++ b/tools/perf/tests/shell/stat.sh
@@ -16,6 +16,24 @@ test_default_stat() {
echo "Basic stat command test [Success]"
}
+test_stat_count() {
+ echo "stat count test"
+
+ if ! perf list | grep -q "cpu-cycles OR cycles"
+ then
+ echo "stat count test [Skipped cpu-cycles event missing]"
+ return
+ fi
+
+ if perf stat -e cycles true 2>&1 | grep -E -q "<not supported>"
+ then
+ echo "stat count test [Failed]"
+ err=1
+ return
+ fi
+ echo "stat count test [Success]"
+}
+
test_stat_record_report() {
echo "stat record and report test"
if ! perf stat record -o - true | perf stat report -i - 2>&1 | \
@@ -201,6 +219,7 @@ test_hybrid() {
}
test_default_stat
+test_stat_count
test_stat_record_report
test_stat_record_script
test_stat_repeat_weak_groups
---
Thanks,
Ravi
On 2025-04-24 2:52 a.m., Ravi Bangoria wrote:
> On 23-Apr-25 12:17 PM, Luo Gengkun wrote:
>> Perf doesn't work at perf stat for hardware events:
>>
>> $perf stat -- sleep 1
>> Performance counter stats for 'sleep 1':
>> 16.44 msec task-clock # 0.016 CPUs utilized
>> 2 context-switches # 121.691 /sec
>> 0 cpu-migrations # 0.000 /sec
>> 54 page-faults # 3.286 K/sec
>> <not supported> cycles
>> <not supported> instructions
>> <not supported> branches
>> <not supported> branch-misses
>
> Wondering if it is worth to add this in perf test. Something like
> below?
>
> --- a/tools/perf/tests/shell/stat.sh
> +++ b/tools/perf/tests/shell/stat.sh
> @@ -16,6 +16,24 @@ test_default_stat() {
> echo "Basic stat command test [Success]"
> }
>
> +test_stat_count() {
> + echo "stat count test"
> +
> + if ! perf list | grep -q "cpu-cycles OR cycles"
> + then
> + echo "stat count test [Skipped cpu-cycles event missing]"
> + return
> + fi
> +
> + if perf stat -e cycles true 2>&1 | grep -E -q "<not supported>"
> + then
> + echo "stat count test [Failed]"
> + err=1
> + return
> + fi
> + echo "stat count test [Success]"
> +}
> +
> test_stat_record_report() {
> echo "stat record and report test"
> if ! perf stat record -o - true | perf stat report -i - 2>&1 | \
> @@ -201,6 +219,7 @@ test_hybrid() {
> }
>
> test_default_stat
> +test_stat_count
I think the perf stat default should always be supported, not just cycles.
Maybe we should add the check in test_default_stat?
Thanks,
Kan> test_stat_record_report
> test_stat_record_script
> test_stat_repeat_weak_groups
> ---
>
> Thanks,
> Ravi
>
Hi Kan,
>>> Perf doesn't work at perf stat for hardware events:
>>>
>>> $perf stat -- sleep 1
>>> Performance counter stats for 'sleep 1':
>>> 16.44 msec task-clock # 0.016 CPUs utilized
>>> 2 context-switches # 121.691 /sec
>>> 0 cpu-migrations # 0.000 /sec
>>> 54 page-faults # 3.286 K/sec
>>> <not supported> cycles
>>> <not supported> instructions
>>> <not supported> branches
>>> <not supported> branch-misses
>>
>> Wondering if it is worth to add this in perf test. Something like
>> below?
>>
>> --- a/tools/perf/tests/shell/stat.sh
>> +++ b/tools/perf/tests/shell/stat.sh
>> @@ -16,6 +16,24 @@ test_default_stat() {
>> echo "Basic stat command test [Success]"
>> }
>>
>> +test_stat_count() {
>> + echo "stat count test"
>> +
>> + if ! perf list | grep -q "cpu-cycles OR cycles"
>> + then
>> + echo "stat count test [Skipped cpu-cycles event missing]"
>> + return
>> + fi
>> +
>> + if perf stat -e cycles true 2>&1 | grep -E -q "<not supported>"
>> + then
>> + echo "stat count test [Failed]"
>> + err=1
>> + return
>> + fi
>> + echo "stat count test [Success]"
>> +}
>> +
>> test_stat_record_report() {
>> echo "stat record and report test"
>> if ! perf stat record -o - true | perf stat report -i - 2>&1 | \
>> @@ -201,6 +219,7 @@ test_hybrid() {
>> }
>>
>> test_default_stat
>> +test_stat_count
>
> I think the perf stat default should always be supported, not just cycles.
> Maybe we should add the check in test_default_stat?
Do you mean:
if perf stat true 2>&1 | grep -E -q "<not supported>"
err=1
Isn't this ambiguous? Also, this fails on machines where HW pmu
is not supported. For ex, on my qemu guest with `-cpu pmu=off`:
$ ./perf list | grep "cpu-cycles OR cycles"
<empty output>
$ ./perf stat true
Performance counter stats for 'true':
0.42 msec task-clock:u # 0.470 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
48 page-faults:u # 113.874 K/sec
<not supported> cycles:u
Thanks,
Ravi
Hi Ravi,
Sorry for the late response. I was on vacation.
On 2025-04-25 6:12 a.m., Ravi Bangoria wrote:
> Hi Kan,
>
>>>> Perf doesn't work at perf stat for hardware events:
>>>>
>>>> $perf stat -- sleep 1
>>>> Performance counter stats for 'sleep 1':
>>>> 16.44 msec task-clock # 0.016 CPUs utilized
>>>> 2 context-switches # 121.691 /sec
>>>> 0 cpu-migrations # 0.000 /sec
>>>> 54 page-faults # 3.286 K/sec
>>>> <not supported> cycles
>>>> <not supported> instructions
>>>> <not supported> branches
>>>> <not supported> branch-misses
>>>
>>> Wondering if it is worth to add this in perf test. Something like
>>> below?
>>>
>>> --- a/tools/perf/tests/shell/stat.sh
>>> +++ b/tools/perf/tests/shell/stat.sh
>>> @@ -16,6 +16,24 @@ test_default_stat() {
>>> echo "Basic stat command test [Success]"
>>> }
>>>
>>> +test_stat_count() {
>>> + echo "stat count test"
>>> +
>>> + if ! perf list | grep -q "cpu-cycles OR cycles"
>>> + then
>>> + echo "stat count test [Skipped cpu-cycles event missing]"
>>> + return
>>> + fi
>>> +
>>> + if perf stat -e cycles true 2>&1 | grep -E -q "<not supported>"
>>> + then
>>> + echo "stat count test [Failed]"
>>> + err=1
>>> + return
>>> + fi
>>> + echo "stat count test [Success]"
>>> +}
>>> +
>>> test_stat_record_report() {
>>> echo "stat record and report test"
>>> if ! perf stat record -o - true | perf stat report -i - 2>&1 | \
>>> @@ -201,6 +219,7 @@ test_hybrid() {
>>> }
>>>
>>> test_default_stat
>>> +test_stat_count
>>
>> I think the perf stat default should always be supported, not just cycles.
>> Maybe we should add the check in test_default_stat?
>
> Do you mean:
>
> if perf stat true 2>&1 | grep -E -q "<not supported>"
> err=1
>
Yes, I assumed that all the events in the perf stat are always
available. But it seems the assumption is only true for bare metal.
> Isn't this ambiguous? Also, this fails on machines where HW pmu
> is not supported. For ex, on my qemu guest with `-cpu pmu=off`:
>
> $ ./perf list | grep "cpu-cycles OR cycles"
> <empty output>
>
> $ ./perf stat true
> Performance counter stats for 'true':
> 0.42 msec task-clock:u # 0.470 CPUs utilized
> 0 context-switches:u # 0.000 /sec
> 0 cpu-migrations:u # 0.000 /sec
> 48 page-faults:u # 113.874 K/sec
> <not supported> cycles:u
>
If taking the virtualization into account, the test_stat_count looks
good to me.
Thanks,
Kan
On 2025-04-23 2:47 a.m., Luo Gengkun wrote:
> Perf doesn't work at perf stat for hardware events:
>
> $perf stat -- sleep 1
> Performance counter stats for 'sleep 1':
> 16.44 msec task-clock # 0.016 CPUs utilized
> 2 context-switches # 121.691 /sec
> 0 cpu-migrations # 0.000 /sec
> 54 page-faults # 3.286 K/sec
> <not supported> cycles
> <not supported> instructions
> <not supported> branches
> <not supported> branch-misses
>
> The reason is that the check in x86_pmu_hw_config for sampling event is
> unexpectedly applied to the counting event.
>
> Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Yes, it should only be applied for the sampling event. The
event->attr.freq is always 0 in the counting mode.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Thanks,
Kan> ---
> arch/x86/events/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 6866cc5acb0b..3a4f031d2f44 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
> if (event->attr.type == event->pmu->type)
> event->hw.config |= x86_pmu_get_event_config(event);
>
> - if (!event->attr.freq && x86_pmu.limit_period) {
> + if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
> s64 left = event->attr.sample_period;
> x86_pmu.limit_period(event, &left);
> if (left > event->attr.sample_period)
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: 1a97fea9db9e9b9c4839d4232dde9f505ff5b4cc
Gitweb: https://git.kernel.org/tip/1a97fea9db9e9b9c4839d4232dde9f505ff5b4cc
Author: Luo Gengkun <luogengkun@huaweicloud.com>
AuthorDate: Wed, 23 Apr 2025 06:47:24
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 24 Apr 2025 20:15:04 +02:00
perf/x86: Fix non-sampling (counting) events on certain x86 platforms
Perf doesn't work at perf stat for hardware events on certain x86 platforms:
$perf stat -- sleep 1
Performance counter stats for 'sleep 1':
16.44 msec task-clock # 0.016 CPUs utilized
2 context-switches # 121.691 /sec
0 cpu-migrations # 0.000 /sec
54 page-faults # 3.286 K/sec
<not supported> cycles
<not supported> instructions
<not supported> branches
<not supported> branch-misses
The reason is that the check in x86_pmu_hw_config() for sampling events is
unexpectedly applied to counting events as well.
It should only impact x86 platforms with limit_period used for non-PEBS
events. For Intel platforms, it should only impact some older platforms,
e.g., HSW, BDW and NHM.
Fixes: 88ec7eedbbd2 ("perf/x86: Fix low freqency setting issue")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250423064724.3716211-1-luogengkun@huaweicloud.com
---
arch/x86/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6866cc5..3a4f031 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -629,7 +629,7 @@ int x86_pmu_hw_config(struct perf_event *event)
if (event->attr.type == event->pmu->type)
event->hw.config |= x86_pmu_get_event_config(event);
- if (!event->attr.freq && x86_pmu.limit_period) {
+ if (is_sampling_event(event) && !event->attr.freq && x86_pmu.limit_period) {
s64 left = event->attr.sample_period;
x86_pmu.limit_period(event, &left);
if (left > event->attr.sample_period)
© 2016 - 2026 Red Hat, Inc.